PREFACE


The Archaea are clearly recognizable as a unique and 
interesting group of organisms for many important 
reasons. They have distinct molecular characteristics 
that clearly distinguish them from the Bacteria and 
the Eucarya, and evolutionary studies highlight the 
quintessential role that the Archaea have played in 
shaping all life on Earth. Many archaea are extre- 
mophiles and have been responsible for rewriting the 
textbooks with regard to the degree to which an
organism can tolerate, or even thrive in, abiotic extremes.
As a result, the combined implications of their evolu- 
tionary role and their capacity to thrive in environ- 
mental extremes have expanded the horizons for the 
astrobiology community in searching for extraterres- 
trial life. Properties that are characteristic of or 
unique to the Archaea span from fundamental bio- 
chemical signatures (e.g., the glycerol-1-phosphate 
lipid backbone) and metabolic pathways (e.g., meth- 
anogenesis) to fundamental properties of their “life- 
style” (e.g., no pathogenic archaea are known). While 
the Archaea, Bacteria, and Eucarya represent coher- 
ent lineages, they are also chimeric in part. Archaea 
tend to have information-processing systems (e.g., 
DNA replication, transcription, and translation) in 
common with eucarya, while they share other fea- 
tures (e.g., metabolism) with bacteria. These charac- 
teristics provide important fuel for debates about the 
course of evolution. They also offer practical bene- 
fits for advancing studies on eucaryal organisms, in 
particular, metazoans, which are often more complex 
and difficult to study, by providing analogous biolog- 
ical targets for a broad range of studies (e.g., protein 
X-ray structures for eucaryal homologs from archaeal 
hyperthermophiles). 

Given their importance to biology, one
might be surprised that so few books cover
the biology of the Archaea to any real extent. The best 
introductory microbiology textbook that covers the 
Archaea is Brock: Biology of Microorganisms, in which 
archaea are covered accurately, although necessarily
briefly due to the scope of topics covered. Archaea: 
Molecular and Cellular Biology is pitched at practi- 
tioners in the field, and as a useful resource for those 
less familiar with the Archaea but who are charged 
with teaching the topic. It is intended to be accessible
to postgraduates and advanced-level undergraduates
who will, we might hope, be inspired to become
the next generation of “archaeaologists.” Each chap- 
ter covers essential background, focuses on the dis- 
coveries of the twenty-first century, and concludes 
with a view of what is expected to arise in the next five 
years. The book aims to be the authoritative refer- 
ence source for the many disciplines interested in the 
Archaea. 

In many fields, particularly in the early years of
growth, terminology is coined by different research 
groups and gradually evolves into a sea of descrip- 
tions. The development of the Archaea as a field of 
study is no exception. We have adopted the conven- 
tions proposed by Carl R. Woese, Otto Kandler, and 
Mark L. Wheelis in their article “Towards a natural 
system of organisms: proposal for the domains Ar- 
chaea, Bacteria, and Eucarya,” published in the Pro- 
ceedings of the National Academy of Sciences of the 
United States of America, 1990 (87:4576-4579), and 
Norman Pace in his essay “Time for a change,” pub- 
lished in Nature, 2006 (441:289). This avoids compli- 
cations with the use of terms such as “archaebacteria” 
and “eubacteria,” which in essence inappropriately 
describe two different forms of “bacteria,” and the 
term “prokaryote.” 

While the Archaea have often been appreciated 
for the qualities listed above (e.g., their extremophilic 
nature), molecular ecological studies have highlighted 
the diversity and ubiquity of archaea on the planet. It 
may indeed be impossible to identify a type of habitat 
where archaea are completely absent. Traditionally, 
the taxonomy and physiology of the Archaea have 
been described by referring to halophiles, hyperthermophiles,
and methanogens. While this serves a purpose,
members of the Archaea know no boundaries
and are unified by their fundamental cellular proper- 
ties. This book embraces this knowledge and is dedi- 
cated to fully describing the molecular cell biology of 
the Archaea. 

The core of the book, Chapters 3 to 18, describes 
the key cellular processes such as DNA replication, 
transcription, translation, lipids, and metabolism, 
with a depth and completeness that extends to
unique features of aminoacyl-tRNA synthesis,
signal transduction, and posttranslational modifica- 
tion (to mention but a few of the topics). Chapter 2 is 
a large chapter that covers general characteristics and 
important model organisms. By broadly overviewing 
the ecology and physiological diversity of the archaeal 
world, the chapter provides a reference point for con- 
templating the biology of the cell. Chapters 19 to 21 
cover genomics, functional genomics, and molecular 
genetics. Although relevant aspects of these topics are 
integrated into preceding chapters, Chapters 19 to 21
provide a depth and focus that highlight the evolu- 
tionary implications, the state-of-the-art, and the 
strengths and limitations of each of the fields. The 
book concludes with two chapters that cover biotech- 
nological and biomedical applications of archaea and 
their cellular products. 

Chapters 2 to 23 have been written and edited 
with the aim of capturing each author’s individual ex- 
pertise, while generating a reference text with a syn- 
thesized style and tone. It is indeed not merely a collec- 
tion of conference proceedings. The exception to this 
general style is Chapter 1, a personalized account by 
Carl Woese of the “history of the Archaea move- 
ment.” In his chapter, “The Archaea: an Invitation to 
Evolution,” Carl offers a provocative and inspiring 
account of how the Archaea were discovered, delves 
into the implications of archaeal distinctiveness for 
the origin and evolution of life, and describes the sci- 
entific and social hurdles that needed to be overcome 
to bring the field forward to where it is today, a time 
in which the Archaea, whose existence could not have 
been guessed only a short time ago, have come of age. 

It is my hope that the many hours of effort that 
preceded the publication of this volume will not have
been in vain, that the volume in hand will be the first 
place active researchers go to for information on the 
Archaea, and that the younger generation of biolo- 
gists will be inspired to try their hand at these organ- 
isms, whose secrets we have only begun to unlock and 
which offer so much in the way of fundamental in- 
sights and practical benefits. 

It would be ungracious of me not to acknowledge 
the help I have received along the way from the many 
people who provided assistance. It is impossible to cite 
everyone, but, in particular, I thank Greg Ferry for 
constructive and encouraging words (and great coffee) 
that helped to initiate the book and drive it along, and
Kevin Sowers for supporting my efforts (in particular, 
with Guinness) during sabbatical in his laboratory. In- 
teractions with authors have been uplifting experi- 
ences, which I think illustrates that those in the Ar- 
chaea field share a common excitement and joy of 
discovery, and certainly know how to pull together
when it counts. Sincere thanks to Carl Woese
for letting me see inside his world, and for not only 
sharing rare insight but timely, inspiring, and warm 
encouragement. Thanks to Greg Payne who has pro- 
vided the critical and smooth link to the rest of the 
team at ASM Press—here’s to you all, and to the first 
edition of Archaea: Molecular and Cellular Biology. 


Ricardo Cavicchioli 


Chapter 1


The Archaea: an Invitation to Evolution


CARL R. WOESE 


BEGINNINGS 


The discovery of the archaebacteria was serendip- 
itous, but not unexpected. In the late 1960s I had 
begun assembling the program for inferring (organ- 
ismal) genealogical relationships through rRNA se- 
quence comparisons (46). It was time to determine 
the structure of the universal phylogenetic tree (what- 
ever it might be). Molecular evolution had been on 
the scene for the better part of a decade, and a uni- 
versal framework within which to study evolution 
from the molecules on up was needed. Our initial em- 
phasis was necessarily on the microbial world (bac- 
teria, in particular), for almost nothing was known 
about microbial phylogenies. 

My objective in establishing the phylogenetic 
program, however, was not to refine bacterial taxon- 
omy per se, but to restore an evolutionary perspec- 
tive/spirit to biology. This time the focus would be on 
the evolution of the cell itself, in particular, the evo- 
lution of its translation mechanism (and the informa- 
tion-processing systems in general) (38, 41). A reduc- 
tionist discipline of molecular biology had deliberately 
ignored evolution—asserting that a fundamental un- 
derstanding of life could be acquired without consid- 
ering its evolution! This was unacceptable to me, 
anathema: a Biology that is not fundamentally evo- 
lutionary is not Biology. 

The cell’s translation mechanism (and the other 
cellular information-processing systems) are far too 
complex to have arisen in anything approaching their 
modern “fully evolved” state (38, 41). Therefore, each 
must have come from far more primitive beginnings 
and so have undergone profound and telling evolutionary
changes. To reconstruct even the final stages
in this evolutionary progression was to come face to 
face with biological organization. (Biological organi- 
zation cannot be understood apart from its evolution. 
All biological form is ultimately complex dynamic 
process—process involuted beyond our current ca- 
pacity to understand it. In an important sense biolog- 
ical form is four dimensional.) 

How exactly to go about determining a universal 
phylogenetic tree was, then, the question. Since the 
1950s the idea that an enormous phylogenetic record 
lay buried in the sequences of macromolecules had 
been there—a record that could be read through com- 
parative sequence analysis (6, 49). At that point, 
however, the approach had been used solely with pro- 
teins—in part, because nucleic acid sequencing had 
yet to be developed; in part, because most biologists 
had convinced themselves that proteins were the only 
way in which this could be done. The enterprise had 
even been labeled “protein taxonomy” initially (6). 
Having worked with ribosomes, I thought the ap- 
proach might work with the ribosomal RNAs if only 
we had the proper method for characterizing them. 

Fortunately, that method came along in 1965— 
Sanger’s oligonucleotide-cataloging approach to RNA 
sequencing (25). The Sanger method and ribosomal 
RNA seemed the perfect combination: the method 
was powerful (yielded far more sequence information 
than anything that had preceded it), and rRNAs were 
ubiquitous and relatively easy to isolate; their se- 
quences were quite highly conserved (10). The uni- 
versal phylogenetic (genealogical) tree was only the 
first step in a journey, however: a world map, if you 
will, that helps to locate relics of the evolutionary 
past. I guess I didn’t realize how long it was going to 
take just to find that map! 


Carl R. Woese • Department of Microbiology, University of Illinois, Urbana, IL 61801.

Several years were spent in tuning the Sanger
method to suit our needs. By the early 1970s we were 
finally up and running, albeit very slowly. Initially we 
groped our way tentatively into the darkness of bac- 
terial phylogeny, starting with garden variety mi- 
croorganisms—Escherichia coli (here we had only to 
improve on what others had already done) (33); Aero- 
bacter aerogenes (46); Bacillus (half a dozen or so 
species) (45); and various others. Almost immediately 
it became apparent that we were going to need help 
to cover the full range of the bacteria—at least in any 
reasonable time frame. Many organisms would just 
be too difficult for us to grow. Enlisting the aid of the 
experts, that is, those experts who were willing to 
grow their organisms in the proper radioactive med- 
ium, was essential. Such people were hard to find, but 
once the right constellation of experts started to form, 
we were on our way in earnest—characterizing rRNAs 
from mycoplasmas, other pathogens, a host of clos- 
tridia, many lactobacilli, bacteroides, a representative 
collection of photosynthetic bacteria, and, of course, 
key species of what would become the archaebacte- 
ria, and so on. 

Of all the numerous suggestions we had gotten 
for organisms to study, the one I solicited from my 
colleague at the University of Illinois Department of 
Microbiology, Ralph Wolfe, turned out to be the most 
important. Ralph was in the process of working out 
the biochemistry of methanogenesis, which made 
it natural for him to suggest we characterize the 
methanogens. It was not the organisms per se, but the 
evolutionary conundrum they posed, that intrigued 
me. If the methanogens were taxonomically grouped 
on the basis of common methanogenic biochemistry, 
then the resulting taxon contained a wide variety of 
morphologies (including Gram stain differences) and 
growth conditions. The alternative morphologically 
based grouping would cast methanogenic biochem- 
istry to the taxonomic winds. This last is how the sev- 
enth edition of Bergey’s Manual had done it (2); but 
in the eighth edition (methanogenic) biochemistry 
had been used as the taxonomic desideratum (22). (A 
similar conundrum was posed by bacterial photosyn- 
thesis, in which case molecular analysis would resolve 
the issue differently—in favor of a reticulated bio- 
chemistry spread across the bacterial organismal phy- 
logenetic tree.) We had the technology here and now 
to settle the phylogenetic issue methanogenesis raised. 
Methanogens went to the top of the to-do list. 

The only difficulty was that, at the time—which 
at the latest was spring 1974 (the semester before 
George Fox joined the lab as my post doc)—a tech- 
nology that would allow growing methanogens read- 
ily and safely in radioactive medium had not yet been 
developed. It would soon be (1). William (Bill) Balch, 
a graduate student in Wolfe’s lab, was about to de- 
velop such a method for other reasons: he would 
grow methanogens in pressurized serum bottles, a 
technique that was not only efficient and adaptable, 
but, what was important to us, safe for working with 
radioactively labeled cultures. By 1976 it had become 
possible for the two labs to collaborate. 

When the first methanogen 16S rRNA oligonu- 
cleotide catalog rolled off the production line (June 
1976), I was stunned by what we had; and thereafter 
our entire focus was to be on methanogens and the 
new group of organisms, the archaebacteria, that they 
would come to represent. We had discovered a “third 
form of life,” a new “urkingdom”! Methanogens (and 
their yet undiscovered relatives) stood genealogically 
completely apart from the other bacteria we had so far 
characterized (which we came to call “eubacteria”) 
and from the (small but representative) cluster of eu- 
caryotes whose rRNA catalogs we had done. We were 
in for the ride of our scientific lives! 


HITCHING UP THE TEAM 


Like everyone else in those days, I had tacitly ac- 
cepted that all bacteria were “prokaryotes.” That is, 
they were all of a kind; all had stemmed from some 
common procaryotic ancestor (29, 31, 32). Suddenly 
confronted with a procaryote that didn’t seem to be 
a procaryote, I had to ask myself why I had believed 
all bacteria to be procaryotes in the first place? No 
good reason, it turned out! When analyzed scientifi- 
cally the “procaryote” didn’t wash; there was no hard 
evidence, not even sound scientific reasoning to back 
it up (31, 32). We had hard evidence in hand and 
were in the process of gathering a lot more. 

Within six months we had the rRNA catalogs of 
another five or so methanogens, with others in pro- 
duction, and all of them possessed the same anomalous 
type of rRNA (9, 11). That was it! We had shown 
methanogenesis to be monophyletically distributed. 
The exceptional variability in the group’s morpholo- 
gies and habitats was what needed explaining. 

I had shared my “eureka moment” (upon assem- 
bling the first methanogen rRNA catalog) with my 
then post doc, George Fox, who, along with Bill Balch, 
had gotten the project actually up and running. George 
and I became the first to see the phylogenetic “light,” 
to realize that there were actually three primary lines of 
descent (urkingdoms) on this planet, not two! 

Convincing ourselves was not the problem. Con- 
vincing others was. It would be a hard sell. For rea- 
sons I could not understand at the time, literally all 
biologists believed in “the prokaryote.” And it was 
not your typical scientific belief—always open to 
question. This was dogma, unshakable doctrine!
Some biologists would take years to overcome their 
procaryote prejudice. Some have yet to do so. Most of 
those who came to accept the three urkingdom notion 
were in fact not even genuine “converts”; while they 
accepted eubacteria and archaebacteria to be ge- 
nealogically distinct (representing two separate ori- 
gins of bacteria), they held firm to the notion that 
both of them still had the same generic cellular orga- 
nization (31, 32, 41). 

So many times I have heard that archaea are 
“still just bacteria (procaryotes)” or words to that 
effect—they have the same cellular organization! 
What does this mean? In a purely scientific sense, it 
means nothing! The resemblances between the two 
are in essence negative—features not found in eu- 
caryotic cells (viewed under a light microscope). It 
means nothing that archaebacteria and eubacteria 
both show simple rod, coccoid, or spiral shapes; it 
means nothing that both are typically much smaller 
than eucaryotic cells, that neither has any microscop- 
ically visible intracellular inclusions. All that these 
facts tell us is that neither type possesses certain fea- 
tures characteristic of eucaryotes. Nothing has been 
(or could be) said about the “common organization” 
the two supposedly share! 

Consider an analogous situation on the meta- 
zoan level. Three kinds of eyes exist among animals. 
Their similarity is defined only functionally. Struc- 
turally the three are worlds apart. Might not the same 
situation exist with regard to archaebacterial and eu- 
bacterial organizations (41)? After all, there are no 
facts to refute the idea, and the extremely gross traits 
(cited above) that all “procaryotes” were supposed to 
share could easily be rationalized in general terms 
having to do with lifestyle more than anything else. 
Its organization is the most complex attribute the cell 
has. No biologist would accept that a trait this com- 
plex could have arisen more than once. Nor would 
any biologist accept that two independently evolved 
“procaryotic” cell types would be so organizationally 
similar that their differences would be trivial, unin- 
teresting! The above metazoan example makes per- 
fectly clear that if evolution builds two complex traits 
that resemble one another in some overall functional 
sense, it cannot build them so alike on the underly- 
ing (complex) structural level that they will fail to 
show major (nonhomologic) differences. 

In sum, the notion that eubacteria and archae- 
bacteria could have evolved independently yet have 
arrived at so similar a cellular organization as not to 
be of interest to the biologist, is absurd; it contradicts 
all that we know about evolution! Microbiologists 
were simply having their cake and eating it too. More 
has to be said about the “procaryote” simplification 
and will be below, for it is pivotal to microbiology’s 
development (or lack thereof) in the twentieth century. 


MAPPING OUT THE TERRITORY 


Having discovered what we thought to be a third 
primary line of descent (represented at the time solely 
by methanogens), we needed to explore its evolution- 
ary implications. The theory of genealogical descent 
would perforce be put to its greatest test yet! Evolution
as we know it demanded that the new “urkingdom”
display two fundamental characteristics:


1. It should comprise a number of major organ- 
ismal groups very different from one another in their 
overall phenotypes. (My colleague Norman Pace used 
to refer to this as “kingdom level” diversity.) 

2. The phenotypic features integral to the warp 
and woof of the basic archaeal cell design should 
strongly distinguish it from the eubacterial cell design 
(and vice versa)—features at least as striking as those 
that distinguish plants from animals, for example. 


Fortunately, a small cadre of scientists had quickly 
come around to the three-urkingdom perspective (in 
addition to our collaborators Wolfe and Balch). In 
particular, I would mention the Germans Otto Kan- 
dler, Wolfram Zillig, and Karl Stetter. Kandler, whose 
studies at the time centered on (bacterial) cell walls, 
had visited the University of Illinois in early 1977 
(prior to the publication of our work). He was one 
of the first outsiders whom we told about the three- 
kingdom concept, and the only one to understand 
and accept it upon first hearing! Wolfram Zillig, a 
group leader at the Martinsried Max Planck Institute, 
was an expert in DNA-dependent RNA polymerases.
Both he and Kandler had encountered certain anom- 
alies in their own research that could be neatly ex- 
plained on the basis of a three-urkingdom hypothesis. 
Karl Stetter, who had trained with both Kandler and 
then Zillig, also appreciated the significance of the ar- 
chaebacteria—and proceeded to fashion a career as 
the greatest of the “swashbuckling” archaea hunters of 
the twentieth century; I wouldn’t be surprised if most 
of the isolated archaeal species today have come from 
his laboratory. These three individuals were initially 
the most effective of all in promoting the study of the 
archaebacteria, especially among Europeans. They 
were a strong second front, as it were, in part because 
Europeans had been less affected by the procaryote 
propaganda that so dominated (micro)biology in 
North America. 


WHERE ARE THE COUSINS?


Methanogens offered no clues as to their cousins. 
The unusual coenzymes that methanogenic biochem- 
istry utilizes seemed to occur almost nowhere else. 
With Kandler’s visit to Urbana, however, there came a 
break. Kandler already knew (as did others working 
in the cell wall field) that the walls of extremely 
halophilic bacteria did not contain peptidoglycan, 
which at the time had been taken as a defining char- 
acteristic of the “procaryote” (31). Moreover, he had 
just determined that the wall of one methanogenic 
species was also aberrant (though not of a halobac- 
terial type) (15). 

The walls of the would-be archaebacteria were
not the most important clue, however, for they would 
turn out to be nonhomologous among themselves. 
But the walls were pointing us in the right direction. 
The critical clue was to be atypical lipids possessed by 
the halophiles—ether-linked and branched chain (not 
ester-linked and straight chain, as the lipids of bacteria 
and eucaryotes are). These strange lipids would ulti- 
mately be found in all the archaebacteria character- 
ized. They were the first of the universal phenotypic 
characteristics found exclusively in the archaebacteria! 

Ether-linked lipids had been discovered by Mor- 
ris Kates in the early 1960s (16, 20). Kates took them, 
as did everyone else at the time, to be adaptive, aris- 
ing independently in certain organisms that grow in 
extreme environments. Shortly thereafter, the micro- 
biologist Thomas Brock isolated two strange new 
types of bacteria that inhabited thermophilic niches, 
Sulfolobus and Thermoplasma (the former discovered 
also in Italy, and named Caldariella) (3, 7, 8). Another 
Tom, Thomas Langworthy, went on to show that the 
lipids of Brock’s new isolates were of the ether-linked 
type but somewhat different from the extreme halo- 
phile lipids: Thermoplasma and Sulfolobus had 
“tetra-ether” lipids, which were back-to-back cova- 
lently linked versions of the halophile (diether) lipids 
(20, 21). 

The pieces of the archaebacterial puzzle now be- 
gan to fall into place. Not only were the phenotypi- 
cally diverged cousins of the methanogens beginning 
to show up, but so were the traits common to all ar- 
chaebacteria. The rRNA sequences that defined the 
archaebacteria, though phylogenetically very telling, 
should not be considered traits in the true (informa- 
tive) phenotypic, organismal sense. These sequences 
(in effect gene sequences) tell you nothing about the 
overall phenotype, either of the group as a whole, or 
of the subordinate subgroupings therein. A tree based 
on rRNA sequence comparisons is basically just an 
abstract genealogical tree. Look at it as only a road 
map interconnecting towns. The contours of the map 
begin to appear, the towns bustle with people, only 
when the phenotypically informative traits become 
known. Those features common (and unique) to all 
archaebacteria then reveal the overall landscape of 
the archaebacterial world. The first common features 
to appear were: 


1. The unusual ether-linked, branched-chain 
lipids. 

2. The characteristic subunit structures of their 
DNA-dependent RNA polymerases (more similar to 
their eucaryotic than their [eu]bacterial counter- 
parts) (48). 

3. An unusual and characteristic modified ver- 
sion of the so-called common arm sequence in tRNAs, 
the archaeal version of which uniquely had the sequence
Ψ-Ψ-C-G, rather than the customary eubacterial
and eucaryotic T-Ψ-C-G (12). Also, the so-called
D-loop of archaebacterial tRNAs contains no “D,” 
dihydrouridine (12). 

4. Cell walls that lack peptidoglycan, a negative 
trait, equivalent, say, to our metaphorical landscape’s 
having no lakes (15). 

5. A largely unique spectrum of antibiotic sensitivities
vis-à-vis the eubacteria (4).


Now, the rRNA analyses began to tell us even more: 
the new archaebacterial urkingdom was larger than 
we first thought. While the halobacteria and Thermo- 
plasma lay within the phylogenetic confines defined 
by the methanogens (44), the rRNAs of the Sulfolob- 
ales showed these organisms to be only a sister group 
to the others. (The in-group species would be called 
the Euryarchaeota; the Sulfolobales in turn would be 
the Crenarchaeota [43].) And, a notably deep (and 
still not understood) phylogenetic divide separated 
the two groups (39, 44). 


REACTIONARY SCIENCE AT ITS BEST 


To this point my narrative may have felt like 
emerging from a dark forest (ignorance) to find be- 
fore you a beautiful panoramic vista, with the clear 
sky of a bright future above you—a sky clear except 
for one small cloud on the horizon (a small cloud 
called the “procaryote”). As it came nearer, however, 
that small cloud would darken and broaden into a vi- 
olent storm (as we will now see). 

Things had gone well with the archaebacteria to 
begin with—before their formal presentation to the 
scientific community and the public at large. By the
fall of 1977 the time for a formal debut had come; 
now the world would see the new urkingdom, the 
hitherto unknown “third form of life.” I had alerted 
both the NSF and NASA, my funding agencies, that 
what we were about to publish might be of particular 
interest to them, might make a little splash—the storm 
had yet to materialize. The two agencies were indeed 
interested and decided to make a joint statement to the 
press on the day the first of our two articles appeared 
in print. That would be the 3rd of November 1977. I 
realized something bigger than anticipated was afoot 
when one or two days before the formal press release 
the phone started ringing and a reporter from The 
New York Times showed up in my office. 

November 3rd. There they were! The archaebac- 
teria! Right there on the front page of The New York 
Times (and a lot of other newspapers around the 
globe). 

But unbeknownst to anyone at the time, a telling 
coincidence had occurred, one that set the stage for 
an epic drama, a struggle for the soul of biology—a 
drama in which today we are all actors. November 
3rd just so happened to be the date chosen by the 
then president of the U.S. National Academy of Sci- 
ences, Philip Handler, to release an official statement 
heralding the dawn of the cloning era (signaled by the 
recent cloning in bacteria of the gene for the growth 
hormone somatotropin). As it can now be appreci- 
ated, that fortuitous coincidence was a foretaste of 
the first skirmish in the ideological struggle between 
what would become the biomedical-industrial com- 
plex and resurgent evolution. Our “third form of 
life,” which touched on one of the deepest chords in 
human nature (i.e., where we came from), completely 
wiped the press release announcing the era of “Man- 
the-medical-miracle” off the front pages of the pa- 
pers. I was overjoyed at the public’s appreciation of 
our work (9, 42)! 

The celebratory mood, the joy at the public and 
general scientific reaction, was short-lived, however. 
As alluded to above, resistance to the concept had al- 
ready surfaced to some extent in the microbiological 
community. Now the storm hit full force! On the day 
The New York Times announced our discovery of a 
“third form of life” on its front page, my colleague 
Ralph Wolfe received a telephone call from his friend, 
the Nobel Laureate Salvador Luria (whom he did not 
initially identify), an upset Salvador Luria. Accord- 
ing to Wolfe, Luria told him in no uncertain terms to 
publicly dissociate himself from this scientific fakery
or face the ruination of his career. In a recent re- 
counting of the episode (47), Ralph said he was so 
humiliated he “wanted to crawl under something and 
hide,” but he did tell Luria that supporting evidence 
for the claim had been published. When informed 
that the scientific journal was the Proceedings of the 
National Academy of Sciences (a fact that had been 
mentioned in The New York Times article), Luria, a 
bit befuddled, had blurted out that the October issue 
had just arrived but he hadn’t looked at it yet. Fortu- 
nately, Ralph left for a planned out-of-town family 
gathering the next day (47) and thereby escaped any 
further humiliation. 

As you might expect, I saw the episode and its 
overall significance differently. How could this Luria 
fellow have the temerity to excoriate his friend and 
my colleague like that? What pulpit was he preach- 
ing from? It appears that he had blustered at Ralph 
something to the effect that: “Everybody knows that 
all bacteria are procaryotes; there can’t be any such 
thing as a ‘third form of life!’” Irony of ironies! As 
time (and the diligence of a particular scientific his- 
torian) have shown, the fakery lay not in our work, 
but in the procaryote concept itself (26): it is now 
clear that the “procaryote” was mere guesswork 
(more on this below). But in the heyday of the procaryote,
which this was, the true believers were out to
pillory us for our heresy: how dare we slander their 
Procaryote! 

That day in November 1977 had laid bare a fun- 
damental structural problem that affected all of biol- 
ogy, and the condition of microbiology was only the 
most obvious symptom of it. The name of that prob- 
lem was molecular biology—under whose hegemony 
twentieth-century biology had developed. Molecular 
biology’s world view was that of classical physics—a 
world view that is inimical to biology. Without doubt 
spectacular progress occurred under molecular biol- 
ogy’s hegemony, but the result has been a superhigh- 
way to a biological dead end—unless one looks at 
bioengineering as the ultimate goal of biology! Our 
encounter with microbiologists over the archaebacte- 
ria was by no means the parochial taxonomic squabble 
one might think. The issue was not confined to micro- 
biology, nor even to biology as a whole. The ultimate 
issue is the relationship between biology and society— 
their conjoined future. The archaebacteria were telling 
twentieth-century biology (microbiology in particu- 
lar) that it had forsaken its roots and had no more 
chance of becoming mankind’s ultimate view of biol- 
ogy than an oak tree has of growing on a flat rock. 
Such advice ran afoul of biology’s conventional wis- 
dom, and biology, microbiology in particular, was re- 
acting accordingly. 

As I stated earlier, one of my goals in establish- 
ing the program of phylogenetic analysis in the first 
place had been to restore a sadly lacking evolutionary 
perspective/spirit to biology. I had long cautioned that
biology was on the verge of being taken over by “sci- 
ence mongers and technological adventurists.” The 
great fear was that biology would cease being a basic 
scientific discipline, becoming merely an olio of sepa- 
rate subdisciplines, whose common focus was applied 
problems—the complaints of the society, as it were. 
This is acceptable and proper up to a point, of course, 
but beyond that point the framework of basic biology 
crumbles and the basic science turns into engineering. 
And this must not happen! The “procaryote” was both 
the symbol and the linchpin of biology’s structural 
problem. (Try to imagine how different twentieth- 
century biology might have been had microbiologists 
stayed true to their birthright, i.e., the world of mi- 
croorganisms—rather than turning their discipline 
into a theme park for molecular gawking.) 

The discovery of the archaea had provided one of 
those special moments when things are put into grand 
perspective: it was a wake-up call to microbiology and 
provided a time for biology as a whole to reassess its 
head-long dash into reductionism. Introspective op- 
portunities such as this are rare, ephemeral, and tran- 
sient. In contrast the grip of convention is strong, per- 
sistent, and compelling. The discovery of the archaea 
was a moment that should have been grasped. Some 
of us tried hard to bring this about. If only microbiol- 
ogy would change back to the kind of discipline that 
Beijerinck knew and was trying to develop, there would 
be hope. But there were so many road blocks, so many 
citadels to conquer. The immediate road block, of 
course, was the mighty “prokaryote,” which seemed 
to deny everything that biology is—organism, ecology, 
evolution. From the late 1970s onward I perceived defanging
the procaryote as absolutely essential if
microbiology (and biology) were to be set right again,
and I devoted much of my energy to it. 


DECONSTRUCTING THE PROCARYOTE 


The prokaryote-eukaryote dichotomy of the 1960s 
was not an astonishing formulation. It neither took 
scientists by surprise nor opened up new avenues 
of research. It was a rhetorical discovery, one which 
involved summoning lost words from a foreign sci- 
entist in an obscure journal and synthesizing con- 
temporary data based on molecular biology and 
electron microscopy. It was greeted with accolades 
not because of its novelty, but because it seemed to 
close, once and for all, long-simmering issues. It con- 
firmed and clarified the differences between bacteria 
and blue-green algae on the one hand, and viruses 
and the cells of protists, fungi, plant and animal, 
on the other. The belief in the monophyly of bacte- 
ria, moved by historical inertia, and strengthened
by molecular biology’s model organism [E. coli],
resulted in a crowning achievement: the legitimizing
of the new kingdom, Monera, or of the superking- 
dom, Prokaryota. The prokaryote-eukaryote di- 
chotomy thus marks a signal moment in the devel- 
opment of biology (26). 


And, I will add, a moment that brought microbiol- 
ogy’s conceptual development to a halt! Stanier and 
van Niel’s “procaryote” paper of 1962 (32) intended 
to, and did, accomplish three things for the field. (i) It 
stated outright the problem that all microbiologists 
intuitively knew to exist from the outset: “. . . the 
abiding intellectual scandal of bacteriology has been 
the absence of a clear concept of a bacterium . . . the 
problem of defining these organisms as a group in 
terms of their biological organization is clearly still 
of great importance, and remains unsolved” (32)— 
thereby exposing the discipline’s roots in organism 
(biological organization). (ii) It severed these roots by 
declaring the old (phylogenetic) approach to the 
problem unworkable and unnecessary—false asser- 
tions both. (iii) It bunched together what remained 
and stuffed it into the narrow vase of reductionism— 
having left out those parts that wouldn’t fit. As a re- 
sult, we have before us today a conceptual bouquet 
that is thoroughly wilted—unified in name only, the 
“discipline of microbiology.” 

Microbiology’s historical development and the 
climate in which it occurred underlie and define the 
whole issue. That history effectively began in the last 
half of the nineteenth century, when microbiology 
was first attempting to gather itself into a coherent bi- 
ological discipline. The principals whose works best 
epitomize this phase in microbiology’s history are 
Ferdinand Cohn, Louis Pasteur, Martinus Beijerinck, 
and Serge Winogradski. 

It was self-evident to Cohn that bacteria stood 
apart from other forms of life. He distinguished the 
“Schizophytes” as the “. . . first and simplest division
of living beings . . .” (5, p. 201). Beijerinck also sub- 
scribed to this general view, but from a remarkably 
enlightened perspective. For Beijerinck the microbial 
world was “... that part of nature which deals with 
the lowest limits of the organic world, and which con- 
stantly keeps before our minds the profound prob- 
lem of the origin of life itself” (quoted in reference 
34). Beijerinck has been pigeonholed historically as 
the father of microbial ecology (34), but that narrow- 
ing of him hardly does the man or his scientific per- 
spective justice. In his day Beijerinck (unlike most of 
his successors) was a complete microbiologist/biolo- 
gist. He did not concern himself solely with culturing 
individual bacterial species and studying them physi- 
ologically (although he had made revolutionary con- 
tributions to the area by his introduction of
enrichment culturing methodology). By his own account,
his “. . . approach can be concisely stated as the study 
of microbial ecology, i.e., of the relation between en- 
vironmental conditions and the special forms of life 
corresponding to them” (34). Beijerinck stood for a 
microbiology that was biology in the full sense, which 
could synthesize from its various roots a greater 
whole: its medical origins (Koch’s world); its founda- 
tion in practical ecological concerns (fermentation, 
Pasteur’s somewhat reductionist medical world); 
Beijerinck’s own pioneering work on geomicrobiol- 
ogy and virology; and the rich and tangled involve- 
ment of microbes with the whole of Nature (so well 
captured by the image of the Winogradski column— 
or the hot spring colorations at, say, Yellowstone 
Park). For Beijerinck microbial ecology was definitely 
not a secondary pursuit, derivative of biochemistry. 
His microbiology was synthetic and holistic, not re- 
ductionistic, mechanistic, or derivative. 

Beijerinck’s perspective might well have been the 
broadest and most coherent foundation that micro- 
biology could ever have known. But circumstances all 
worked against the development of his holistic view- 
point: The methodologies required to usefully realize 
it would not be in place for roughly half a century; 
there was no phylogenetic framework within which 
to structure and develop it, and (unlike developmen- 
tal biology) microbiology had no conceptual sea wall 
to staunch the surge of reductionism—a reductionism 
that denied holism and, so, evolution per se. Finally, 
there was the overarching “linear” perspective that 
had worked so well in the world of classical physics, 
and with which classical physics (through molecular 
biology) was now infecting biology (40). The imme- 
diate success of molecular biology’s confident reduc- 
tionism gained it a string of converts among concep- 
tually struggling microbiologists. 

It comes as no surprise, then, that Beijerinck’s 
appointed successor, Kluyver, who was trained as a 
(natural products) chemist, would take microbiology 
in a doggedly reductionist direction. The “unity of 
biochemistry” became microbiology’s new leitmotif,
and that plus the perceived need to elucidate all the 
nuanced variations on the central biochemical themes, 
constituted the territory to be conquered (17, 18, 36). 
The new approach to microbiology that Kluyver’s 
“Delft School” pioneered was known as “compara- 
tive biochemistry” (17). The biology, however, lay 
mainly in the name. (A holistic biologist would be 
fully justified in calling this a “piñata paradigm”—for
at the end of the day the “piñata” (the organism) is
gone, but there remains plenty of [molecular] candy 
for the kids!) 

Every scientific discipline reflects a particular scientific
mythology, which is its axiomatic foundation.
In microbiology’s case that mythology needed to be
grounded in “organism,” in biological organization. 
This had not happened. By the mid-twentieth century 
the piñata paradigm, then in full force, had yielded
bacterial biochemistry galore—all manner of new 
biochemicals and biochemical pathways had been un- 
covered and the underlying unity of biochemistry 
seemed assured. Still, one of the best-recognized micro- 
biologists of his day, Roger Stanier, found a need to 
denounce the state of microbiology (quoted above) as 
an “abiding scandal” (32). He had trenchantly sensed 
that microbiology’s scientific mythology had yet to be 
properly established. 

Unfortunately, Stanier and van Niel’s purpose in
writing The Concept of a Bacterium (32) was not to 
address the long-standing foundational issue. It was, 
as said, to get rid of it, to cover it up—and in so doing 
(appear to) provide microbiology with its badly 
needed (organismal) touchstone. The formal defini- 
tion of “prokaryote” read as follows: “The distinctive 
property of bacteria and blue-green algae is the pro- 
caryotic nature of their cells” (32), from which the 
statement “. . . we can therefore safely infer a common
[prokaryotic] origin for the whole group in the
remote evolutionary past” must follow (31). No more, 
no less! 

In the term’s obscure past (26), the prefix “pro-” 
had obviously expressed evolutionary relationship of 
some sort between bacteria and the “higher forms” of 
life. Both Cohn and Beijerinck had seen bacteria as 
primitive and perhaps predating the “higher forms”— 
as did many other microbiologists then and later. 
Evolutionary relationship was not intended, however, 
by the “new” procaryote; it was not a part of its new 
wardrobe. An evolutionary perspective had effectively 
been proscribed! Thus the “pro-” in “procaryote” 
would now remain unexplained in the term’s revived/ 
revised usage (32). Note that “procaryote” is for all 
intents and purposes absent from the literature before 
the 1960s (26). The word “procaryote” fails even to 
appear in what at the time was microbiology’s pri- 
mary teaching instrument, the first (1957) edition of 
The Microbial World (30). Yet the term and concept 
were the centerpiece in the second (1963) edition 
(31). Within a decade of its announcement “procary- 
ote” was on every biologist’s lips; it had become con- 
ventional wisdom (29). What was this bizarre turn 
of events all about? 

It was all about “The King is dead; long live the 
King!” It was about regime change, conceptual tran- 
sition. Microbiology’s now moribund attempts to pull 
itself together as an organismal discipline (in keeping 
with Beijerinck’s vision) had altogether failed. Micro- 
biologists had been unable to develop the needed 
framework of a “natural classification.” Microbial 
ecology had been stopped in its tracks by its inability
to define niches in organismal terms—to representatively
sample, isolate, or phylogenetically characterize
tively sample, isolate, or phylogenetically characterize 
the bacteria therein. Bacterial evolution (phylogeny) 
had never gotten beyond the hopeful speculation 
stage—although that in itself had given microbiology 
a certain sense of direction. Yet the whole enterprise 
was now declared null and void—a waste of time 
(26, 29, 30). 

Ironically, this cynical, dismissive attitude that 
“procaryote” embodied developed just when the mol- 
ecular methodology for solving the problem of bac- 
terial relationships was coming onto the scene, and 
the coming of a new field of “protein taxonomy” had 
already been proclaimed (6). Organismal thinking, 
however, now went against reductionist microbiol- 
ogy’s grain. 

Microbiologists of an earlier day had struggled 
to define bacteria holistically—as their botanist and 
zoologist colleagues had done so successfully with 
plants and animals. However, microbiology’s new re- 
ductionist outlook increasingly trivialized the emer- 
gence that was “organism” (biological organization). 
And by midcentury, the time had come to eliminate 
organismal thinking entirely and move on (26). 

The sleight of hand by which this transition was
accomplished is remarkable in its simplicity. Bacteria 
were flat-out declared all to be of a kind—no scien- 
tific discussion needed! Placing them all into a newly 
named highest level taxon, Procaryotae (22), served 
reflexively to justify the coup. Classification of bacte- 
rial species, which henceforth would serve only utili- 
tarian purposes, was relegated to the dull pedantry 
of determinative classification (31, 35). 

The key biological problem that bacteria pre- 
sented, their biological organization, was simply filed 
away under “unimaginable complexity.” No one 
knew, or could possibly know anything about it! 
Nevertheless, that organization was declared to be 
the same, of the same genre, in all bacteria. What a 
convenient conceptual “black hole”! The organism, 
the biology, evolution, disappeared into the impene- 
trable sink of organizational sameness! All of the bi- 
ological concerns disappeared, like peas in a shell 
game. In this way microbiology could “know itself” 
to be an organismal discipline in camera, as it were, 
without having to admit it (through its actions) to the 
reductionist world around it! 

Microbiologists will some day have to admit that 
in accepting the procaryote, they were rejoicing in the 
Emperor’s New Clothes. And the archaea, like the 
child in the nursery tale, had shouted the obvious. 

The famed physicist Erwin Schroedinger con- 
demns what he calls “guesswork solutions” to scien- 
tific questions. Science must reject these because such
a “. . . fake removes the urge to seek after a tenable
answer.” Adding, “So efficiently may attention be di- 
verted that the answer is missed even when, by good 
luck, it comes close at hand” (28). There you have it: 
the procaryote! 

Nevertheless, only by seeing the “procaryote” 
for what it really is—scientifically valid or not—can 
microbiology get back on its naturally intended 
course. The whole intent of the “procaryote” was to 
supply bacteriology with the semblance of a hun- 
gered-for mythology, the axiomatic foundation the 
discipline required to know and present itself as a co- 
herent organismal discipline. Among other things, 
this dynamic would explain the dogged loyalty mi- 
crobiologists have shown to the concept, their rapid 
and unquestioning acceptance of it, as well as their 
immediate rejection (a decade and a half later) of the 
three-urkingdom concept (which had exposed their 
exposed “Emperor”); it would explain microbiolo- 
gists’ consistent adherence to the modes of thought 
and academic organization inherent in the simplistic 
“procaryote/eucaryote” dichotomy—not to mention 
microbiology’s persistent blindness with regard to 
evolution in general. 


RECAPITULATION AND CONCLUSION


Where Are We?


This narrative has followed in both fact and feel- 
ing the discovery of the archaea. It began on a note 
of joy, the feeling of power, potential, and flat-out 
awe that accompanies any major discovery. The ar- 
chaea represented a radical departure in the way we 
looked at life on this planet. Biologists had thought 
that all life on earth arose originally from one of two 
primary lines of descent, either the eucaryotic or the 
so-called procaryotic. The archaea were here to set 
that right. There were three, not two, primary lines of 
descent, and the procaryote was a false concept— 
which never had any scientific evidence to support it 
in the first place, and had never been put to proper 
scientific test. 

Throughout the history of microbiology bacte- 
ria had been seen as a grouping of some sort. But that 
grouping, which customarily went under the names 
Schizomycetes or Monera, was in effect gradistically 
defined: bacteria were those “simple” organisms that 
were not eucaryotic in cellular type. The eucaryote 
may have been phylogenetically defined, but not the 
procaryote. Phylogenetic relationships among the bac- 
teria had been impossible to determine (until the ad- 
vent of molecular sequencing), and this made for a 
chronic foundational problem in the discipline. As 
Roger Stanier pointed out in the 1960s, microbiology
did not know itself, for it had no biological
concept of the organisms it studied (32). Conceptually
ostracized by molecular reductionism, microbiolo- 
gists had sought to resolve this foundational problem 
and move on—that is, join the reductionist ranks. 
There was no need to resolve the matter scientifically 
in this climate; it could simply be done away with 
rhetorically, by declaration! 

Thus, by fiat, all bacteria shared some hypothet- 
ical common “procaryotic” cellular organization and, 
therefore, stemmed from a common procaryotic an- 
cestry (31). The purpose of the procaryote guesswork 
had merely been to make microbiology’s foundational 
problem go away, by effectively taking the organism 
out of consideration. 

The discovery of the archaea would settle this 
foundational issue scientifically, not rhetorically. Mol- 
ecular sequencing evidence showed that the bacteria 
were indeed only a gradistically defined group. Phylo- 
genetically, “bacteria” comprised two distinct group- 
ings, not one (41). Microbiology’s foundational issue 
clearly had not been settled; the nature of the organ- 
ism must remain a central concern for the discipline. 

The message of the archaea was only superficially 
one of organismal discovery. There was a deeper, more 
important message embedded therein: namely, that in 
adopting reductionist fundamentalism (40)—which 
denied the basic nature of both organisms and evolu- 
tion—microbiology (and biology as a whole) had been 
going down the wrong road for at least half a century, 
and it was time to stop and reassess the matter. 

Though reluctant at first, microbiology did even- 
tually come around to accepting the archaea. How- 
ever, if one were to consider this acceptance a victory 
for the archaeal perspective, it had to be a pyrrhic 
one. All that microbiology had done was to add a 
new high-level grouping to its taxonomy. Microbiol- 
ogy’s world view had not changed one iota in the 
process! There was still no recognition that microbi- 
ology is a true evolutionary, organismal discipline! 
The creative energy inherent in the archaeal discovery 
had been siphoned off—their call for biological refor- 
mation (microbiology in particular) drained of its 
spirit. For me the archaea (at this point) were a fail- 
ure. They had effectively changed nothing! 

As a discipline with no self-image, no proper ax- 
iomatic foundation, microbiology could have been 
only a passive bystander in the titanic struggle that 
was twentieth-century biology. For most of the twen- 
tieth century the discipline simply drifted with the 
conceptual tides. After several decades of cajoling I fi- 
nally gave up on trying to reform microbiology (the 
discipline as academically constituted, that is). The 
problem of the microbial world clearly remains, but 
the inflexible and ineffective formal discipline of
microbiology that emerged from the twentieth century
seems incapable of dealing with it. 

The problem of the microbial world has long ago 
transcended microbiology and is on its way to be- 
coming a forefront for biology as a whole in the 
twenty-first century. Microorganisms dominate the 
planet’s biosphere. They make up most of the planet’s 
living biomass. They are the metabolic engines of the 
planet. They were essential to the evolution of all 
macroscopic life on earth (both the mitochondrion 
and the chloroplast originated as bacterial endosym- 
bionts). Without bacteria all life on this planet would 
disappear in short order—the same cannot be said for 
macroscopic life! We need to assimilate this message 
fully and study the microbial world accordingly. 

Biology today is in conceptual crisis. I do not 
consider microbiology to be the primary problem, 
however. Molecular reductionism is. Microbiology— 
a properly constituted microbiology—is the basis for 
a solution. Beijerinck showed us the beginnings of 
such a microbiology a long (and long forgotten) time 
ago. Here was (could have been) microbiology in the 
full sense—not that pale residue of it one sees later 
in the reductionist piñata paradigm. For Beijerinck
(see above) the organism was no mere sum of its 
parts; first and foremost the study of bacteria was 
ecological, one in which the organismal community 
and its environment were a paramount concern. At 
the end of the day microbiology (as microbial ecol- 
ogy) was a study in “. . . the profound problem of the 
origin of life itself” (34)—and this would include the 
evolution of the organism-environment dichotomy in 
the first place. 

Beijerinck’s microbiology was ultimately a study 
in the emergence of biological organization. Yet emer- 
gence was something proscribed by the new sum-of- 
the-parts world of molecular biology. No reduction- 
ist, Beijerinck seemed to view that which emerges 
from a given level of biological organization as just as 
important, just as fundamentally biological as what 
underlay and gave rise to the given level in the first 
place. 

If there is to be a basic biology in the future (in- 
stead of an applied, engineering discipline) then biol- 
ogy must be freed from the shackles of reductionism. 
(This is especially important in microbiology’s case, 
where many of its current practitioners still view bio- 
chemistry, metabolism, as the essence of a bacterium— 
to them bacteria are still bags of enzymes!) 

If you were to ask a thorough-going molecular 
reductionist today what fundamental problems in bi- 
ology remain to be solved, the answer you are most 
likely to get would be “none.” How could there be 
any such when The Gene (in the molecularist’s sense) 
had answered the great question of biology, “What 
is Life?” (27). To the molecular coterie of the mid-twentieth
century, molecular biology seemed “the biology
to end all biology” (14)!

Going into the twentieth century, biologists con- 
sidered evolution to be the foundational concept of 
biology. Molecular biology was quick to replace evo- 
lution with The Gene, however. For these reduction- 
ists, evolution, far from being significant to any fun- 
damental understanding of biology, was no more 
than an epiphenomenon (14)! As it now stands, how- 
ever, it is not that evolution is epiphenomenological, 
it is that a reductionist view of biology is incomplete 
and misleading. 

The twentieth century gave rise to two concep- 
tual revolutions in science, one in biology—the effects 
of which all biologists working today can feel in their 
scientific bones—and (a more subtle and philosophi- 
cally profound) one in physics. Both concerned the re- 
ductionist world view. The biological revolution pro- 
mulgated reductionism (40). The one in physics 
brought it into question. Incredible! Just as molecular 
biology was rebuilding biology along reductionistic 
lines, the discipline of physics (which had given rise to 
[classical] scientific reductionism in the first place) 
was in the process of dismantling it. This is surely one 
of the great ironies in biology’s history—a grand 
twentieth-century biological cathedral erected on an 
old and crumbling nineteenth-century foundation! 

In replacing evolution (descent with variation) as 
the fount and focus of biology by the gene, molecu- 
lar biology had replaced a (nascent) process perspec- 
tive by a mechanistic materialistic one. If evolution 
(biology) is anything, it is a quintessentially nonlinear 
process—the very aspect of reality that could not be 
encompassed by the linear mathematics that had de- 
fined and delimited the world view of classical physics. 
Thus, the molecularist cannot even begin to recognize 
evolution, much less accord it the scientific respectabil- 
ity that is its due. And that is why the molecular par- 
adigm now finds itself aground on the shoals of (bio- 
logical) complexity, biology’s true nature. 

Almost needless to say, evolution, though anti- 
thetical to twentieth-century molecular reductionism, 
is in complete harmony with the complex dynamic 
perspective in physics today. Both are studies in emer- 
gent organization. Biologists are going to see major 
changes in their world as the synthesis of the studies 
of biology and complex systems—which will become 
the biology of the twenty-first century—comes into 
being. 


The Dawning of a New Biology 


Have you noticed that nonlinear mathematics 
(pictorially presented) has an innate aesthetic appeal
(I am thinking here of those delightful books of fractal
representations one can see on so many bookshelves
these days)? These types of pictures are not
new; only the underlying mathematics is. Cultural im- 
ages such as these go back to the dawn of mankind. 
Nonlinear mathematics seems somehow to resonate 
deeply in the human psyche. 

To me the juxtaposition of aesthetic feeling and 
scientific understanding here means one thing! In the 
(near) future, when evolution is couched in a nonlin- 
ear mathematical framework, representations of evo- 
lution will be not only spectacular in their simplicity, 
but compelling in their beauty. There is a deep (yet 
to be appreciated) connection between evolution and 
our study thereof. 

Although the discovery of the archaea had not 
been transformative of microbiology, a seed had been 
planted and it quietly began to grow. Interest in iso- 
lating novel and interesting bacteria was revived. Bac- 
terial taxonomists began to see rRNA sequence as the 
gold standard for species identification and classifica- 
tion. It seemed possible for the first time that the day 
would come when all the bacterial species could be 
brought together in one grand, comprehensive classi- 
fication—all those that could be cultivated, that is. 
Microbiologists had long thought that the reserve of 
undetected bacterial novelty had been pretty well ex- 
hausted, that most of the high-level bacterial taxa had 
been discovered. The archaea, of course, had shaken 
that belief somewhat, but even with their discovery 
nothing spectacular, world shaking, was expected to 
come from further studies in bacterial diversity. How 
wrong we were! 

While I was visiting my colleague Norman Pace 
in Denver back in the early 1980s, he suggested a 
novel idea for how to identify bacteria (24). Since 
bacteria could now be identified on the basis of their 
rRNAs alone, why bother growing the organism in 
the laboratory? Why not take the rRNAs (or their 
genes) directly from the setting in which the organ- 
isms naturally grow? That way, you would not be at 
the mercy of the vagaries of laboratory cultivation 
and so could, in principle (by isolating enough of the 
same type of gene directly from a given environment), 
have an unbiased representation of all the bacterial 
types in that environment. 

Such a powerful methodology changes even the 
way organism/environment questions are initially 
framed—just as Beijerinck’s enrichment culturing had 
done in its day (34). The difference between the new 
and the old methodology would be like shopping 
with or without access to a shopping catalog. If you 
had such a catalog, you might even decide you wanted 
to look for something other than what you originally 
intended (or in addition to it). Very powerful! What’s 
more, Pace told me, all of this could be done simply
by adapting some of the currently available molecular 
genetic technology. 

Pace’s idea was no innocent extension of the bac- 
terial characterizations we and others had been do- 
ing (based on cultured strains). It was a whole new 
way of exploring the organism-environment relation- 
ship. Pace had questioned what all microbiologists 
had previously taken for granted—namely, that to 
identify and know anything about the relationship of 
a particular bacterial species to other bacteria, that 
species had first to be grown in the laboratory. Cultur- 
ing methods imposed crippling limitations on what 
organisms could be isolated. There was no way one 
could study organisms in their natural settings in any 
comprehensive manner (24). Pace told me he planned 
to go up to Yellowstone National Park as soon as 
possible to get some samples to test out his idea. I 
tagged along. Unbeknownst to any of us, Pace’s 
methodology (and the extensions of it to come) was
opening up the future of twenty-first-century biology. 

By the early 1980s my laboratory and others had 
characterized several hundred bacterial strains by the 
rRNA method. In 1987 I would review the field (39) 
and conclude that there are about a dozen major bac- 
terial groupings (phyla) that we know of, and that 
when all was said and done several more might sur- 
face. How wrong that projection was! To date Pace’s 
methodology has produced in the range of 100,000 
unique rRNA species from various environments, 
and these represent on the order of 50 new bacterial 
phyla. But there is no sign of letup—almost all of the 
separate environmental rRNA isolates have no 100% 
sequence matches anywhere else in the rRNA se- 
quence database. In other words, even this large a 
number of rRNA reads has been a remarkably sparse 
sampling of the total number of “species” that are out 
there in Nature. So large a number of unique species 
of rRNA with no end in sight is starting to question 
the very concept of bacterial species itself! The advent 
of environmental genome sequencing has done noth- 
ing to resolve the situation. It has only deepened the 
conundrum and brought it more sharply into focus. 


The Microbiology Yet To Come 


Given the context of this chapter I will be rela- 
tively brief. Molecular reductionism is now spent as 
a conceptual force and has settled into being a most 
useful body of technology. Microbial biology in the 
meanwhile has undergone a conceptual and method- 
ological revolution of its own, freeing itself from its 
self-inflicted intellectual confinement. 

The future of biology lies in microbial ecology. 
This is not the faux ecology that microbiologists 
knew throughout most of the twentieth century. That 
was a “real world” defined by microbial biochem- 
istry, in which microbial ecology was merely a diver- 
tissement. Neither is it the “traditional,” academically 
constituted, (macro) ecology, which never grounded 
itself in the reality of basic twentieth-century biology 
and so ignored the sleeping giant of microbial ecology, 
and still does so. It is Beijerinck’s microbial ecology: 
fundamental in its own right; evolutionary to its core. 
What most distinguishes Beijerinck from latter-day mi- 
crobiologists is the feeling for the emergence of the 
whole. And that is what will distinguish twenty-first- 
century biology from the biology of the molecular era. 

It is remarkable to see the confluence that is 
bringing about the new biology. Molecular biology, 
despite its aversion to evolution, is moving in an evo- 
lutionary direction—compelled by its own technology 
(principally the capacity to sequence genes). Micro- 
biology, after focusing itself for the better part of a 
century on mechanism (to the near exclusion of mi- 
crobial ecology and a total disregard of evolution) is 
also being compelled by developments in these “sec- 
ondary” areas to reorient itself precisely in their di- 
rection. A synthesis of all biological disciplines around 
an evolutionary focus is in the making. 

The difference between what twentieth-century 
biology was and what twenty-first-century biology 
will be is subtle (abstract) but profound. The twentieth- 
century biologist’s goal was to see biology as mecha- 
nism, to understand it in terms of the static, material 
aspect of Reality. Form was everything! In this view 
evolution becomes an epiphenomenon: a property, 
an incidental characteristic of (mechanistic) biological 
systems. The twenty-first-century biologist has a dif- 
ferent goal: to see biology as arising out of complex 
dynamic process. In this view biological form be- 
comes phenomenological—a characteristic of a Uni- 
versal Evolutionary Process. To put it in the vernacu- 
lar, “Evolution came first!” 

The question “What is life?” is precisely the ques- 
tion “What is evolution?” 

Chapter 2 


General Characteristics and Important Model Organisms 


ARNULF KLETZIN 


INTRODUCTION 


The studies conducted by Carl Woese and co- 
workers in the 1970s opened up new perspectives in 
biological taxonomy and the origin of species (14, 15, 
96, 434, 435, 443) (see Chapter 1). Ribosomal 16S 
(and 18S) RNA (rRNA) and DNA (rDNA) became 
the most important molecules for constructing phylo- 
genetic dendrograms that spanned the entire world of 
living organisms (e.g., 357). The studies obliterated 
the dichotomous distinction between noncomposite 
“prokaryotes” and composite “higher eukaryotes,” 
and allowed a third “urkingdom” to be identified: the 
Archaea (Archaebacteria) (96, 434). The concept of a 
third domain of life explained puzzling biochemical 
observations, such as the unusual composition of cell 
walls, membrane lipids, and RNA polymerase that 
were present in some microorganisms (96). The third 
domain was initially poorly accepted and vigorously 
challenged in the scientific community (50, 220, 221). 
However, the concept of the Archaea was advanced 
through seminal studies performed by Otto Kandler, 
Ralph S. Wolfe, Karl O. Stetter, and Wolfram Zillig 
(e.g., 436), which included, in particular, the isolation 
and characterization of numerous archaeal (and bac- 
terial) extremophiles by Karl Stetter and Wolfram Zil- 
lig (see Chapter 1). 

Initial studies on the Archaea defined three phys- 
iologically different groups: methanogens, haloarchaea, 
and sulfur-dependent thermophiles (96). The under- 
standing of the physiological and phylogenetic diver- 
sity of Archaea has improved as the number of iso- 
lates has increased. All cultivated archaea up to the 
year 2005 (212) may be regarded as extremophiles 
(with respect to their growth temperature, osmolarity, 
or pH) or methanogens (many of which are also ex- 
tremophiles). Natural habitats for extremophilic ar- 
chaea include terrestrial hot springs of volcanic origin, 
hydrothermal vents on the ocean floor, and a diverse 
range of highly saline, acidic, alkaline, and anaerobic 
environments. In contrast, mesophilic and psychro- 
philic archaea are ubiquitous and present in most ter- 
restrial and aquatic environments that have been an- 
alyzed (51, 353). As a consequence, Archaea exhibit a 
diverse range of cellular adaptations. For example, 
energy conservation by sulfur oxidation or reduc- 
tion is a hallmark of the (hyper-) thermophilic ar- 
chaea, an adaptation that links to the abundance of 
sulfur in volcanic environments where they pro- 
liferate (159). In these largely light-independent 
ecosystems, sulfur-dependent and chemolithoau- 
totrophic archaea are among the most important 
primary producers. In contrast, mesophilic and psy- 
chrophilic archaea are present in rivers, ice, and 
lakes (51, 353). Methanogens are ubiquitous in 
freshwater sediments, wetlands, and animals. Mem- 
bers of the Crenarchaeota are present in cold and 
temperate soils worldwide. Molecular ecology stud- 
ies indicate that Archaea fulfill important ecologi- 
cal roles in most ecosystems. In contrast to the 
abundance of molecular ecological data, very few of 
these archaea (with the possible exception of the 
methanogens) have been isolated and cultivated in 
the laboratory (51, 353). 

This chapter provides an overview of the Ar- 
chaea and some of their morphological, physiologi- 
cal, biochemical, and molecular properties, provid- 
ing a platform for the more specific chapters on 
archaeal molecular cell biology. This chapter also in- 
troduces model organisms and systems that have been 
used to study fundamental properties and principles 
of archaeal biology, in addition to those that have 
served as models for understanding the biology of 
more complex eucaryal cells. 
PHYLOGENY OF ARCHAEA 
AND THE ORIGIN OF LIFE 


Because their cellular organization is similar to 
that of Bacteria, Archaea were misclassified until compar- 
ison of their molecular traits revealed that they form 
a fundamentally distinct domain of life (see Chapter 1). 
The tripartite tree based on 16S rDNA sequences is 
supported by biochemical data and is now accepted 
by the majority of scientists as being the best repre- 
sentation of the evolution of the three domains of life. 
Nevertheless, the three-domain concept has been 
repeatedly challenged. The “eocyte tree” separated 
the domain Archaea into several kingdoms (220) and 
eventually led to the proposal of the “ring of life” 
(369). A different proposal classified Bacteria as the 
only “Prokaryotes” and divided them into two sub- 
kingdoms, and placed the Archaea with the Eucarya 
in a kingdom of “Neomura” (50). 

The coherence of Archaea as an evolutionary 
group, and the distinction of the Euryarchaeota/Cren- 
archaeota kingdoms are evident at the molecular level 
(Fig. 1). Numerous common biochemical features 
and the results of comparative genomics support the 
overall architecture of the Archaea tree (40, 41, 95, 
328) (see Chapter 19). It is supported by many stud- 
ies using amino acid sequences of individual proteins 
or combined sets of functional protein classes for tree 
construction, such as ribosomal or translation pro- 
teins (e.g., elongation factors or RNA polymerases) 

Figure 1. 16S rDNA phylogenetic dendrogram of the Archaea. Branches representing cultivated strains (boldface lettering 
and dark polygons); branches representing sequences determined from molecular ecology studies (white polygons). Alterna- 
tive branch points of Nanoarchaeum and Methanopyrus (dashed lines) (40, 42). Reproduced with modifications from Nature 
Reviews Microbiology (353) with permission of the publisher. 

(40, 41, 72, 95, 369) (see Chapters 8 and 19). The va- 
lidity of these trees, and hence the interpretations 
about evolution, is based on the expectation that 
genes from central information-processing pathways 
are less likely to be successfully exchanged with for- 
eign counterparts (even homologous proteins) than 
genes for metabolic enzymes, which may more easily 
be replaced by functional analogs. Most archaeal pro- 
teins required for replication, transcription, and 
translation have homologs with the highest level of 
sequence identity to proteins within the Archaea. In 
addition, these proteins share more similarity with 
eucaryal proteins than with bacterial proteins. In con- 
trast, archaeal proteins involved in metabolic path- 
ways and regulatory functions are generally more 
similar to those of Bacteria than to those of Eucarya. 

Crenarchaeota and Euryarchaeota are not only 
defined by branching orders of dendrograms, but also 
by distinct and nontrivial differences in the replicative 
functions within members of these two kingdoms (95, 
187) (see Chapters 3 and 19). Some genes are present in 
one kingdom but absent from the other (Table 1). 
For example, Euryarchaeota possess heterodimeric 
DNA polymerases of the D family, the ssDNA-bind- 
ing protein RPA, a homotrimeric sliding clamp pro- 
tein PCNA, histones of the eucaryal type, and the 
bacterial-like cell division protein, FtsZ. In contrast, 
the Crenarchaeota harbor the ribosomal protein S30, 
the ssDNA-binding protein SSB, and a heterotrimeric 
PCNA. 
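
The kingdom-specific features named in this paragraph can be pictured as a small lookup table. The Python sketch below is purely illustrative: the names `MARKERS` and `tally_markers` are hypothetical and introduced here only for the example, the entries are taken solely from the paragraph above, and a simple tally of this kind is in no way a substitute for phylogenetic analysis.

```python
from collections import Counter

# Marker features as listed in the text; the mapping itself is an
# illustrative assumption, not a published classification scheme.
MARKERS = {
    "PolD": "Euryarchaeota",
    "RPA": "Euryarchaeota",
    "homotrimeric PCNA": "Euryarchaeota",
    "eucaryal-type histones": "Euryarchaeota",
    "FtsZ": "Euryarchaeota",
    "ribosomal protein S30": "Crenarchaeota",
    "SSB": "Crenarchaeota",
    "heterotrimeric PCNA": "Crenarchaeota",
}


def tally_markers(features):
    """Count how many of the observed features point to each kingdom.

    Features that are not in MARKERS are ignored.
    """
    return Counter(MARKERS[f] for f in features if f in MARKERS)


if __name__ == "__main__":
    observed = ["FtsZ", "PolD", "SSB", "unknown protein"]
    print(tally_markers(observed))
```

A mixed tally, as in the invented usage example, would simply flag a gene complement that does not fit either pattern cleanly.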

Despite the overall consistency of the archaeal 
16S tree, there are still many puzzling features. Most 
prominent is the seemingly illogical branching order 
of the Euryarchaeota. The anaerobic Archaeoglobales 
and the essentially aerobic Thermoplasmatales and 
Halobacteriales disrupt the methanogens into several 
taxonomic groups, a result that is supported by most 
protein-based comparisons (Fig. 1) (41). In contrast, 
all methanogens, Archaeoglobales and Halobacteri- 
ales possess an RNA polymerase B subunit that is en- 
coded by two subunit genes, while the Thermoplas- 
matales do not (Fig. 2) (40). Other archaea possess a 
single large subunit. It is highly improbable that this 
split occurred twice in evolution as the separation al- 
ways occurs at the same position in the gene. There 
are several possible explanations: (i) the dendrograms 
do not give the historically correct branching order; 
(ii) the potential for methanogenesis, which requires 
an extraordinary number of genes for the enzymes 
and for the cofactor biosynthesis, has been either ac- 
quired or lost repeatedly in archaeal evolution. Sup- 
port for this latter possibility comes from the obser- 
vation that some bacterial methylotrophs oxidize 
methane using reverse methanogenesis, and the genes 
are very likely to have been acquired from Archaea 
(55); (iii) a split in the RNA polymerase B subunit 
genes could have been rejoined in the Thermoplas- 
matales. At present it is neither clear which (if any) 
of these alternatives is correct nor whether the 
methanogens are monophyletic. 

The difficulty in inferring the phylogeny of 
methanogens is due in part to the placement of the 
“difficult” and possibly fast-evolving species, Metha- 
nopyrus kandleri, which has a tendency to move its 
branch point in phylogenetic trees depending on the 
dataset analyzed (Fig. 1) (40, 371) (see Chapter 19). 


Table 1. Replication proteins in Crenarchaeota and Euryarchaeota* 

Function                    Crenarchaeota                     Euryarchaeota 
Origin(s) of replication                                      Pyrococcus: 1 
Origin recognition          Cdc6/ORC                          Cdc6/ORC 
Helicase                    MCM (single homolog)              MCM (single or multiple homologs) 
Helicase loader             Cdc6/ORC                          Cdc6/ORC 
ssDNA-binding protein       SSB                               RPA (one or three subunits) 
Primase                     Primase (two subunits)            Primase (two subunits) 
Polymerase                  PolB (one or multiple homologs)   PolB (one or multiple homologs), 
                                                              PolD (heterodimer) 
DNA sliding clamp           PCNA (heterotrimer)               PCNA (homotrimer) 
Clamp loader                RFC (two subunits)                RFC (two subunits) 
Lagging strand maturation   Flap endonuclease 1,              Flap endonuclease 1, 
                            RNase H,                          RNase H, 
                            ATP-dependent DNA ligase          ATP-dependent DNA ligase 
Cell division protein       -                                 FtsZ 
Ribosomal protein S30       +                                 - 

*Data compiled from Kelman and White (187). MCM, minichromosome maintenance; ORC, origin recognition complex; PCNA, proliferating cell nuclear 
antigen; RFC, replication factor C; RPA, replication protein A (Eucarya-like); SSB, ssDNA-binding protein (Bacteria-like); +, present; -, absent. 

Figure 2. Subunit composition and phylogenetic dendrogram of RNA polymerase amino acid sequences. (A) Virtual SDS-polyacrylamide gel showing the 
sizes of the subunits of the different RNA polymerases. Homologous subunits are shown in identical shading. Reproduced with modifications from 
Proceedings of the National Academy of Sciences of the United States of America (225) with permission of the publisher. (B) Phylogenetic dendrogram of 
concatenated RNA polymerase amino acid sequences and presence of a single subunit or two-subunit B homologs. Reproduced with minor modifications from 

Methanopyrus is always placed as one of the most 
deeply rooting Archaea in 16S rDNA dendrograms 
(for example, deeper than Archaeoglobus) separate 
from other methanogens. It branches in different po- 
sitions when comparing concatenated sets of all tran- 
scription versus translation proteins from archaeal 
genomes, where it groups with the Methanobacteri- 
ales (Fig. 2) (40). In addition, it has the split RNA 
polymerase B subunit gene. The “difficult” phylogeny 
may relate to the approximately fourfold higher “in- 
del” (insertions and deletions in otherwise conserved 
genes) frequency in the Methanopyrus genome com- 
pared with other Archaea. This correlates with the 
long branches in dendrograms and is consistent with 
a much faster rate of evolution than other Archaea. 
If this interpretation is correct, it indicates that the 
evolutionary rate is not constant for (micro-) organ- 
isms. This would be consistent with recent construc- 
tions of the archaeal tree, which report that the 16S 
rDNA does not give reliable branching points for 
many higher taxa (328). 

A fast rate of evolution has also been suggested 
for Nanoarchaeum equitans (see Chapter 19). N. eq- 
uitans is a symbiotic, hyperthermophilic archaeon 
that has the smallest known genome of cellular or- 
ganisms (481 kbp) and lives in close association with 
a sulfur-dependent anaerobic hyperthermophile, Ig- 
nicoccus (see Chapter 14). N. equitans was proposed 
to be a member of a new, deeply branching kingdom, 
the Nanoarchaeota (152), which diverged before the 
Crenarchaeota/Euryarchaeota split. However, rDNA 
and protein phylogenies give conflicting outcomes 
(42). It has been proposed that genome reduction led 
to a deletion of many presumed “vital” genes, leading 
to a parasitic rather than a symbiotic relationship 
with Ignicoccus. The genome displays extensive re- 
arrangements including the virtual absence of oper- 
ons, which are otherwise a feature of many archaeal 
genomes (255) (see Chapters 6 and 8). N. equitans 
possesses many genes in common with members of 
the Euryarchaeota and more than half of the protein- 
encoding genes have best BLAST hits with eur- 
yarchaeal species. N. equitans may represent a highly 
derived and fast-evolving euryarchaeal lineage related 
to the Thermococcales (Fig. 1) (42, 255, 328). The 
distinctiveness of the rRNA sequence of N. equitans 
may have arisen as a result of the combined effects 
of adaptation to hyperthermophily, symbiotic 
lifestyle, genome reduction, and rapid evolutionary 
change (42, 429). 

Members of the fourth, deep-branching lineage 
of the Archaea, the Korarchaeota, have been enriched 
from hot environments, but pure isolates have not 
been cultivated (19). Most inference about members 
of the Korarchaeota has been derived from metage- 
nomic studies (277). A sequencing project of a Ko- 
rarchaeota-containing community is under way, and 
this will provide a great deal more information about 
this kingdom in the near future (http://www.jgi.doe 
.gov/sequencing/why/CSP2006/korarchaeota.html). 

Many hyperthermophiles grow chemolithoau- 
totrophically with inorganic energy sources using 
CO2 as the sole carbon source. In addition, most hy- 
perthermophiles (Archaea and Bacteria) have deep 
branching points in phylogenetic dendrograms, and 
most have short branch lengths (Fig. 1). These find- 
ings have led to a hypothesis that the last universal 
common ancestor(s) (LUCA) was a hyperther- 
mophilic chemolithotroph, and that life originated in 
hot environments, such as submarine hydrothermal 
vents (e.g., 386, 419). This popular but also contro- 
versial hypothesis has excited other scientists inter- 
ested in the origin of life, as it has important impli- 
cations for prebiotic evolution and for the properties 
of the first cellular organisms; the debate about the 
hypothesis is heated and ongoing. Support for a “hy- 
perthermophilic origin of life” came from hypothe- 
ses that invoked metabolism evolving on iron sulfide 
(FeS) surfaces during prebiotic times (autocatalytic 
anabolist hypothesis), and from experiments showing 
that some of these reactions are chemically possible 
(151, 183, 420). However, many reasons have been 
used to argue against the “hyperthermophilic origin 
of life” hypothesis (360), including the following: 


• The G+C content of rRNA from hyperther- 
mophiles is high (average, 60%), irrespective 
of their average genome G+C content (e.g., 
many Sulfolobales have ~35%). The high 
G+C bias in rRNA genes of hyperthermo- 
philes leads to long-branch attraction artifacts 
(and other effects) in dendrogram calculations 
that may affect the reliability of the tree (433). 
Protein phylogenies do not always support the 
16S rDNA trees that place hyperthermophiles 
at the root of the tree (40, 42, 202). 

• To cope with temperature effects, hyperther- 
mophiles have developed several other modifi- 
cations besides 16S rRNA composition that 
help to protect against thermal denaturation. 
For example, the degree of tRNA modification 
increases with increasing optimal growth tem- 
perature (Topt) (283). Similarly, the degree of 
cyclization of membrane lipids increases with 
temperature (see “Membranes” below) (107). 
Many hyperthermophiles also contain signifi- 
cant amounts of thermoprotective secondary 
metabolites, such as cyclic bisphosphoglycerate 
(in some methanogens) (235, 364) and di-myo- 
inositol phosphate (in Pyrococcus spp.) (359). 
All of these modifications require specific bio- 
synthetic pathways. It seems improbable that 
these modifications were present at early stages 
of cellular evolution, or in LUCA, and were 
subsequently lost during adaptation to lower- 
temperature environments. It may be expected 
that these mechanisms of modification devel- 
oped over evolutionary time in response to 
adaptation to hyperthermophily. 

• The half-life of some important biomolecules, 
including NAD(P)(H) and ATP, decreases dra- 
matically with increasing temperature and de- 
creasing pH, thereby decreasing the likelihood 
that life could have first evolved under these 
conditions (63). 

• High temperatures may cause reactions to pro- 
ceed in a less controlled fashion than lower 
temperatures, thereby making it more difficult 
for microorganisms to evolve at hot rather 
than cold temperatures. 
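
The first argument in the list above rests on comparing the G+C content of rRNA with that of the genome as a whole. As a minimal sketch of how such a value is computed, the Python function below counts G and C residues in a nucleotide string. The function name `gc_content` and the two toy sequences are assumptions made for illustration; they are not real rRNA or genome data.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction (0 to 1) of a nucleotide sequence.

    Characters other than A, C, G, T, and U (gaps, ambiguity codes)
    are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T, or U")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


if __name__ == "__main__":
    # Invented toy sequences, for illustration only.
    rrna_like = "GGCCGCGGAUCCGCGGCCGA"
    genome_like = "ATTATAGCAATTTAAGTATACT"
    print(f"rRNA-like fragment:   {gc_content(rrna_like):.2f}")
    print(f"genome-like fragment: {gc_content(genome_like):.2f}")
```

Applied to real data, the same calculation would be run separately on the rRNA genes and on the complete genome sequence of an organism.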


A recent, alternative hypothesis is that life evolved in 
stratified seafloor hydrothermal mounds, where mi- 
crocompartments the size of bacterial or archaeal cells 
were formed from FeS precipitates (256), and which 
may have served as scaffolds for cell formation prior 
to the presence of lipid vesicles. According to this hy- 
pothesis, organic precursors were synthesized abioti- 
cally at high temperatures, whereas cellular life forms 
evolved in inorganic FeS microcompartments in warm 
(not hot) zones of submarine hydrothermal vents 
(Fig. 3). This hypothesis takes into account that (i) the 
structure of the minerals in the long-living “Lost City” 
hydrothermal vent field is percolated by warm (40 to 
90°C), gradually cooling vent fluids, and (ii) FeS pre- 
cipitates with a similar structure were detected at an- 
cient hydrothermal sites (337), and their formation 
was simulated in the laboratory (394). These 1- to 
100-µm-wide microcompartments would provide a 
confined space with catalytically active surfaces 
(nickel/iron), which eliminates the need for lipid mem- 
branes to enclose the cellular precursors. It is hypoth- 
esized that temperature, pH, mineral, and redox gra- 
dients within the larger structures would create the 
dynamic disequilibria required for metabolic reac- 
tions, and populations of viruslike RNA molecules, 
encoding small numbers of proteins, would be the 
agents of variation and selection (213). The sorting 
of genetic elements among different compartments 
could result in the proliferation of increasingly com- 
plex molecular ensembles and the selection of entities 
that had achieved replicative advantages. According 
to this hypothesis, LUCA was an inorganically housed 
assemblage of expressed and exchangeable genetic el- 
ements. Escape from these confinements was possible 
only after the evolution of DNA replication and 
membrane biosynthesis. Archaea and Bacteria (and 
probably other, now extinct lineages) could have es- 
caped separately, which would explain some of the 
fundamental differences between both domains 
(213). Many aspects of this hypothesis are compati- 
ble with the “autocatalytic anabolist” and the “RNA 
world” hypotheses. It is difficult to reconstruct this 
environment and test it experimentally, and as a re- 
sult the field will remain open for debate and specu- 
lation. 


GENERAL CHARACTERISTICS OF ARCHAEA 


The overall cellular architecture of Archaea is 
similar to that of Bacteria. The number of known morpho- 
logical types of Archaea is smaller than in Bacteria. For 
example, there are no multicellular stages with cellu- 
lar differentiation, no mycelia-forming archaea simi- 
lar to streptomycetes, or structures similar to the peri- 
plasmic flagella and the flexible body structure of 
spirochetes. On the other hand, members of the Ar- 
chaea do possess morphological features with no 
counterpart in the Bacteria. These include the amoe- 
balike Thermoplasma and Ferroplasma cells (115), 
flat, irregular-shaped Haloferax spp., and rectangular 
haloarchaea reminiscent of a postage stamp (Halo- 
quadratum or “Walsby’s square bacterium,” Fig. 4) 
(421). The metabolic diversity of Archaea appears 
similar to that of Bacteria, although the present un- 
derstanding is incomplete due to incomplete sampling 
of cultivatable isolates. With the notable exception of 
methanogenesis, almost all metabolic pathways dis- 
covered also exist among Bacteria. Photosynthetic 
bacteriorhodopsin (BR), which was thought to be pres- 
ent only in haloarchaea, has recently been detected 
in planktonic bacteria (22, 23), whereas “classical” 
photosynthesis with any type of chlorophyll has not 
been found in any archaea. 

A number of unique molecular properties that 
characterize Archaea were recognized in seminal 
studies from several laboratories in the 1970s and 
1980s (see Chapter 1). These properties helped to 
substantiate that Archaea are fundamentally different 
from Bacteria despite their similar cellular organiza- 
tion (reviewed in reference 436); they include: 


• the presence of phytanyl ether instead of fatty 
acid ester lipids in the membranes. 

• the absence of canonical peptidoglycan and a 
frequent use of proteinaceous S layers. 

Figure 3. Hypothetical scenario for the origin of living cells from (in)organic precursors. Archaeal and bacterial cells are shown 
escaping from within naturally formed inorganic metal sulfide-based compartments in a 3.8-Ga-old hydrothermal vent. The 
compartments (1 to 100 µm in diameter) and vent structures are schematic and not drawn to scale. LUCA, last universal 
common ancestor. Reproduced with modifications from Trends in Genetics (213) with permission of the publisher. 

Figure 4. Cell shapes of various archaea. (A) Scanning electron micrograph of Pyrodictium cells within a network of extracel- 
lular cannulae (388). (B) Phase-contrast micrograph of Haloquadratum walsbyi with gas vesicles. Photograph courtesy of 
T. Hechler, Darmstadt. (C) Electron micrograph of a freeze-etched Metallosphaera sedula cell showing the hexagonal lattice 
of the S layer (157). (D) Ultrathin section of a Methanothermus fervidus cell (387). (E) Ultrathin section of Methanogenium 
carici cells (330). (F) Ultrathin section of Pyrodictium abyssi cells (95). Bars, 1 µm. Panels A and C to E reproduced from 
Bergey’s Manual of Systematic Bacteriology with permission of the publisher. Panel F reproduced from Theoretical Popula- 
tion Biology (95) with permission of the publisher. 


• the complexity of the RNA polymerase. 

• the observation that elongation factor G is 
ADP-ribosylated by diphtheria toxin. 

• the use of unmodified initiator methionine in 
translation, similar to Eucarya. 


A general overview of the characteristic properties of 
archaeal cells is provided below. 


Membranes 


The presence of phytanyl ether instead of fatty 
acid ester lipids in the cytoplasmic membranes is one of 
the most characteristic features that is present in all 
Archaea (see Chapter 15). The ability of archaeal 
membranes to function effectively under severe envi- 
ronmental stress, e.g., gradients in excess of five pH 
units [see “(Thermo-)acidophilic archaea: Thermoplas- 
matales and Sulfolobales,” later in this chapter] (250), 
has prompted intensive efforts to characterize the na- 
ture and biogenesis of membrane lipids. Archaea have 
2,3-di-O-alkyl-sn-glycerol (diether lipids, C20 or C25) 
made of isoprenoid units, which are analogs of con- 
ventional diacylglycerides, but with the opposite 
stereochemistry to Bacteria (1,2-sn-glycerol) (107, 
181, 292). In addition, many archaea have a wide 
spectrum of glycerol or nonitol dialkyl (C40) tetraether 
lipids, which form a monolayer instead of a bilayer 
membrane, and have no ester lipid counterpart (107). 
The tetraether monolayers exhibit unparalleled resis- 
tance to proton transfer and maintain rigidity at high 
temperatures (250). However, the distribution of di- 
and tetraether lipids in Archaea does not simply cor- 
relate with any particular growth characteristic (e.g., 
temperature) (reviewed in reference 250). Most, but 
not all hyperthermophiles contain tetraether lipids. 
The “hyperacidophiles” (pH optimum, 2.5 or lower), 
comprising all Thermoplasmatales and some of the 
facultative autotrophic Sulfolobales, are the only 
group with an almost exclusive use of this lipid type, 
whereas heterotrophic Sulfolobales (pH optimum, 
2.5 to 3.5) (125, 250) contain predominantly diether 
lipids (227). The isoprenoid chains in Archaea can be 
unsaturated or cyclized and contribute to changes in 
membrane fluidity. The average cyclization number of 
hydrocarbon chains can increase with growth tem- 
perature (107, 181). 
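
To put the pH gradients mentioned at the start of this section into perspective, the definition of pH can be used to convert a difference in pH across the membrane into a ratio of proton activities. The short derivation below is a worked illustration only; the subscripts "in" and "out" (cytoplasm and medium) are notation introduced here and do not appear in the original text.

```latex
\[
\mathrm{pH} = -\log_{10} a_{\mathrm{H^+}}
\quad\Longrightarrow\quad
\frac{a_{\mathrm{H^+}}^{\mathrm{out}}}{a_{\mathrm{H^+}}^{\mathrm{in}}}
  = 10^{\,\mathrm{pH_{in}} - \mathrm{pH_{out}}}
  = 10^{5}
\quad \text{for } \mathrm{pH_{in}} - \mathrm{pH_{out}} = 5 .
\]
```

A gradient of five pH units therefore corresponds to a roughly 100,000-fold difference in proton activity between the two sides of the membrane.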

The isoprenoid biosynthesis utilizes the meval- 
onate pathway, similar to animals, whereas the re- 
cently discovered nonmevalonate pathway is essential 
in plants, apicomplexan parasites, and many bacte- 
ria (39, 80). The hydroxymethylglutaryl-CoA reduc- 
tase (HMGR) is the rate-limiting enzyme in the 
mevalonate pathway. There are two paralogous classes 
of HMGR, a eucaryal/archaeal and a bacterial class. 
Whereas most archaea contain the eucaryal/archaeal 
class, the Archaeoglobales and Thermoplasmatales 
use the bacterial class, an observation that is inter- 
preted as an example of lateral gene transfer (see 
Chapter 15) (38). The competitive HMGR inhibitor 
mevinolin or its precursor Lovastatin is used as a selec- 
tion marker in haloarchaeal transformation systems 
(see Chapter 21) (222, 223). 

Biological membranes need to be permeable to 
water to enable the buildup of osmotic pressure within 
the cell, and to maintain the water balance. Water 
permeability is mediated by aquaporins, ubiquitous 
transmembrane proteins belonging to the family of 
major intrinsic proteins (MIPs) (see Chapter 16). 
MIPs facilitate the passive transport not only of water 
but also small neutral solutes (e.g., glycerol) across 
cell membranes (82). Only one archaeal aquaporin, 
AqpM from Methanothermobacter marburgensis, 
has been described and structurally characterized 
(216, 233). It has sequence and structural similarity 
to eucaryal and bacterial aquaporins and exhibits 
high stability against denaturation. The aquaporin 
promotes water but not glycerol permeability, sug- 
gesting that it might keep neutral solutes inside the 
cell. Homologs are found in most but not all archaeal 
genomes. It is not clear how water transport is medi- 
ated in Archaea that do not contain this porin. 


Cell Walls and Extracellular Structures 


The absence of murein in the cell walls (96), and, 
with the exception of Ignicoccus (see “Desulfurococ- 
cales” and Fig. 19, later in this chapter), the absence of 
a periplasmic space are features that distinguish Ar- 
chaea from Bacteria. Only two specialized taxa of 
Archaea have a polysaccharide cell wall, whereas the 
majority have proteinaceous S layers (176) (see Chap- 
ter 15). The Methanobacteriales, one of the orders of 
methanogens (see Chapter 13), stain gram positive 
and they are the only Archaea with cell wall compo- 
nents similar to peptidoglycan (see Chapter 15). The 
cell wall consists of pseudomurein, a β-1,3-linked 
polymer of N-acetylglucosamine and N-acetyltalosamin- 
uronic acid cross-linked with oligopeptides (176). 
Among the haloarchaea, Halococcus species possess 
a thick cell wall made of a complex, highly sulfated 
heteropolysaccharide, which contains the unusual 
components N-acetyl gulosaminuronic acid and N- 
glycyl-substituted glucosamine moieties. Halococcus 
cells do not require high salt concentrations in the me- 
dia to maintain cellular integrity in contrast to other 
haloarchaea (351, 383). This might explain the en- 
durance of Halococcus cells in atypical, low-salt habi- 
tats (313). The related Natronococcus occultus has a 
cell wall with a significantly different structure con- 
sisting of repeating units of a glycoconjugate with 
poly-(L-glutamine) (279). 

S layers of glycoproteins can represent the sole 
cell wall component in Archaea and can provide a 
high level of rigidity to the envelope (79, 261) (see 
Chapter 14). Planctomycetales are the only Bacteria 
known to lack peptidoglycan and to contain a pro- 
teinaceous S layer similar to Archaea (239). Thermo- 
plasma and Ferroplasma lack a cell wall and have a 
cytoplasmic membrane with embedded glycoproteins 
and lipoglycan (31, 88, 226). Thermoplasma grows 
in hypotonic media. Large amounts of glycoproteins 
within the membrane are indicative of a network that 
maintains cellular integrity and may represent an evo- 
lutionary product of an ancestral S layer. 

Archaeal S layers are made of large proteins anchored
in the cytoplasmic membrane and form regular,
symmetrical arrays when viewed by electron microscopy
(64) (see Chapter 14). S-layer proteins assume
a hexagonal symmetry in most cases (Fig. 4), although
several examples of tetragonal symmetries are
known (302, 303, 354). The hyperthermophilic member
of the Crenarchaeota, Staphylothermus marinus,
provides an interesting example of the general architecture
of archaeal S layers, although it is an unusual
type. The scaffold with a tetragonal symmetry is
formed by an extended α4β4-glycoprotein termed
tetrabrachion (Fig. 5), which forms a long stalk
(α-subunit, coiled coil) and a canopy-like network
(αβ-subunits) enclosing a “quasi-periplasmic space” (302,
303). The subunits are generated by proteolytic cleavage
of a single gene product. A subtilisin-type protease
is attached in the middle of the stalks, consistent
with the growth of S. marinus at the expense of peptides
and proteins.

The S layers of most Archaea enclose a “quasi- 
periplasmic space” with variable thickness and with 
pores of varying sizes that provide room for hy- 
drolytic enzymes (e.g., Staphylothermus; Fig. 5), sub- 
strate-binding proteins, electron transport compo- 
nents, and glycosyl residues of membrane proteins. It 
has been speculated that the space might be packed
with proteins and provide a specific extracellular environment
close to the membrane. This could be particularly
important for the “hyperacidophiles,” which
maintain steep pH gradients across the membrane,
and for the protection of proteins with metallic components,
such as iron-sulfur clusters or molybdenum
cofactors. Thus, the “quasi-periplasmic space” may
help buffer the pH gradient against the otherwise
harsh external conditions.

Many archaea, including the cell wall-less Ther- 
moplasma and Ferroplasma, are flagellated (Fig. 6). 
The flagella seem unrelated to their bacterial counter- 
parts in terms of amino acid composition, structure, 
and morphogenesis, although their mechanics and 
overall function are similar (polar or peritrichous fla- 
gellation, formation of bundles, and reversible rota- 
tion) (see Chapter 18). Archaeal flagellar filaments 
share some features with bacterial type IV pili but are 
unique in other aspects (18). Archaeal flagella have a 
smaller diameter than bacterial flagella and a smooth 
surface, but the diameter is larger than type IV pili. 
In contrast to type IV pili, they are synthesized with 
a signal peptide. They are glycosylated and sulfated in 
a similar way to S-layer proteins. In Methanococcus

Figure 5. The Staphylothermus marinus S-layer protein tetrabrachion. (A) Electron micrograph of the negatively stained
tetrabrachion-protease complex. (B) Schematic model of the complex with dimensions. (C) Schematic model of the cell surface of
Staphylothermus. (D) Proposed folding topology with cysteine residues and the unique proline residue separating left- and
right-handed supercoils. Figure compiled and reproduced from Current Biology (259) and the Journal of Molecular Biology
(302) with permission of the publishers.

Figure 6. Flagellated archaea. (A) Thermococcus celer. Reproduced from Bergey’s Manual of Systematic Bacteriology (208)
with permission of the publisher. (B) Thermoplasma acidophilum. Reproduced from the Journal of Bacteriology (31) with
permission of the publisher. (C) Halobacterium salinarum PHH4. Photograph courtesy of T. Hechler, Darmstadt.


voltae the flagellum is composed of four proteins: 
FlaA, FlaB1, FlaB2, and FlaB3. The attached glycan is 
a novel N-linked trisaccharide composed of modified 
mannose and glucose residues linked to asparagine 
residues. The same trisaccharide was identified on a 
tryptic peptide of the S-layer protein from M. voltae, 
implicating a common glycosylation pathway in fla- 
gella and S-layer modification (417). 

Flagella biosynthesis, motor structures, and
switching of flagellar rotation have been studied in detail
in Halobacterium salinarum (see Chapter 18).
As in Bacteria, sensory transducers trigger flagellar
switching (see Chapter 11). A novel transducer
(MpcT) that responds to membrane potential (ΔΨ)
has been characterized in H. salinarum (209) (see
“Bacteriorhodopsins and light-driven ATP synthesis”
below). The structure and function of the flagellar motor
of gram-negative bacteria have been well studied,
in contrast to the archaeal motor, whose anchoring in
the cytoplasmic membrane and S layer is poorly understood.
Fla gene clusters have been identified in several
archaeal genome sequences, but homologs of motor
proteins have not (18).

Pili or piluslike structures have been observed in many
archaea (e.g., 155), and conjugative plasmids have
been identified (332, 352) (see Chapter 5). The structure
and molecular characteristics of the piluslike
appendages of the SM1 euryarchaeon, referred to as
“hami,” were analyzed in detail. They are among the
most exciting new extracellular structures recently
identified in Archaea (Fig. 7) (265-267). SM1 grows in
cold sulfurous springs in southern Germany in assemblages
reminiscent of “strings of pearls” (see Table 7 below). The
inner part of each pearl consists of ~10⁷ cells of SM1,


while the exterior and the filament connectors are
formed either by a Thiothrix or an ε-proteobacterium
(IMB1) (335). SM1 can be grown in situ on polyethylene
fabrics to form high-density biofilms but has not
been cultivated in the laboratory. The archaeal cells
maintain a distance of approximately 2 to 4 μm from
each other and are connected by a thick web of appendages
(Fig. 7). High-resolution electron micrographs
showed that the hami are made of a
regular structure with a defined base and tip. The central
region, up to 2 μm in length, is made of up to 60
repeating units of a 46 × 4 nm, elongated coiled
trimer that consists of a 40-kDa filamentous protein
that is 4 nm in diameter. The trimer is stable against
physical and chemical denaturation, and the amino
acid sequence of the protein is not similar to any sequences
in GenBank. The protein trimer sitting at the
tip of the filament is partially uncoiled and bent back
toward the cell, forming a structure reminiscent of a
three-pointed grappling or fishing hook.


Chromosomes, DNA Structure, and Replication 


Archaeal chromosomes are circular, similar to 
those of most Bacteria, with sizes typically between 
~1.5 and 6 Mbp (e.g., 5.75 Mbp in Methanosarcina 
acetivorans). Methanosarcina species have relatively 
large genomes and are the only Archaea thought to 
have acquired large numbers of genes from other 
species (gene gatherers), similar to the bacterial strep- 
tomycetes and rhizobia that have genomes up to 
10 Mbp in size; the latter appear to have acquired numerous
genes for transporters, two-component systems,
polymer hydrolases, and oxygenases for the

Figure 7. (See the separate color insert for the color version of this illustration.) The euryarchaeon SM1 and its extracellular appendages (“hami”). (A) Electron
micrograph of a “hamus.” (B) Enlargement of the hook region. (C) Simplified model of a hamus with the three filaments shown in different colors and 3D
reconstruction from cryoelectron microscopy. (D) “String of pearls,” archaeal/bacterial community in cold, sulfurous spring water. (E) Hamus model with
dimensions. (F) Natural biofilm hybridized with an SM1-specific fluorescent probe; circle diameter, 4 μm. (G) Pt-shadowed electron micrograph of a single
SM1 cell with appendages. Figure compiled, modified, and reproduced from Biospektrum (264) and Molecular Microbiology (265) with permission of the
publishers.

degradation of unknown substrates (106) (see Chapters
13 and 19). In contrast, most chemolithotrophs
have small genomes between 1.5 and 2 Mbp. The 
smallest genome (481 kbp) was found in the symbi- 
otic archaeon, N. equitans (see “Phylogeny of Ar- 
chaea and the origin of life,” above). Gene organiza- 
tion (e.g., clustering in putative operons) and coding 
density (75 to 90% gene encoding) in Archaea are 
similar to the Bacteria (e.g., 105). Some archaea oc- 
casionally have self-splicing group II introns (138, 
198) but group I introns have not been found (see 
Chapter 7). 

Chromatin proteins are used to organize DNA in 
the cell and to prevent unspecific condensation (see 
Chapter 4). Distinct families of small, basic, and abun- 
dant chromatin proteins are present in all three do- 
mains of life (reviewed recently in reference 340). Ar- 
chaeal chromatin proteins fall into two families, 
histones and Alba proteins. Both have homologs in 
Eucarya. Histones have been well characterized in 
Euryarchaeota (340), and have more recently been 
identified in Crenarchaeota (e.g., Sargasso Sea environmental
genome scaffolds and the genome of Cenarchaeum
symbiosum) (60). Similar to those from
Euryarchaeota, histones from Crenarchaeota form 
dimers in solution. They oligomerize further in con- 
tact with DNA and form compact structures that re- 
semble eucaryal nucleosomes. In contrast, the role of 
the Alba proteins is less well defined as they bind both 
double-stranded DNA (dsDNA) and single-stranded 
RNA (ssRNA). In contrast to Eucarya, there is no consensus
model for archaeal chromatin, as each organism has
a characteristic set of chromatin proteins with unique
regulatory features. Nevertheless, a good understand- 
ing is emerging of how chromatin proteins maintain a 
compact and structured chromosome that is still ac- 
cessible to gene expression complexes (340). Histones 
may also contribute to the stabilization of DNA 
against thermally induced strand separation in the
hyperthermophilic archaea (274) (see Chapter 19). The
assumption that G+C content is a determinant
of DNA thermostability in hyperthermophiles is
wrong. Hyperthermophilic archaea have a G+C content
between ~35% (e.g., Sulfolobus) and ~60%
(e.g., Pyrodictium), which is similar to the range of
G+C content in mesophiles. The only notable exceptions
are rRNA genes (274, 348).
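
The G+C comparison made above rests on a simple count, which can be sketched in a few lines of Python. This is only an illustration: the function name gc_content and the two short sequences are hypothetical and are not taken from any genome discussed in this chapter.

```python
def gc_content(seq):
    """Return the G+C fraction (0.0 to 1.0) of a DNA sequence.

    Symbols other than A, C, G, and T (e.g., N) are ignored.
    """
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (counts["G"] + counts["C"]) / total


# Toy fragments for illustration only; they are not real genomic sequences.
at_rich = "ATATTAGCATTAATGCATAT"
gc_rich = "GCGGCCATGCGCGGATCCGC"

if __name__ == "__main__":
    for name, fragment in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
        print(name, round(100 * gc_content(fragment), 1), "% G+C")
```

Applied to whole genomes and, separately, to their rRNA genes, such a count would allow the two observations in the text (a mesophile-like genomic range, with rRNA genes as the exception) to be checked directly.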

Insight into mechanisms of DNA replication in 
Archaea has progressed very rapidly in the past few 
years (see Chapter 3). Most archaeal replication pro- 
teins are more similar to those found in Eucarya than 
to the analogous proteins in Bacteria (187). For ex- 
ample, Archaea use ATP-dependent DNA ligases, 
similar to Eucarya, as opposed to NAD-dependent 
and phylogenetically unrelated bacterial enzymes 


(206). However, Archaea only possess a subset of the 
eucaryal machinery and possess features of replica- 
tion that are not found in other organisms. Pyrococ- 
cus abyssi was the first archaeon for which a single 
origin of replication was predicted by skew analysis 
and subsequently confirmed experimentally (273) (see 
Chapters 3 and 19). In silico analyses suggested that 
other archaeal species might contain more than one 
origin, and this was recently demonstrated for Sul- 
folobus solfataricus and Sulfolobus acidocaldarius 
(246, 329). Most of the replication proteins have 
been biochemically examined (Table 1). 
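
The text does not state which skew statistic was used for P. abyssi. As a minimal sketch, assuming the commonly used cumulative GC skew, (G - C)/(G + C) summed over consecutive windows, the turning points of the curve can be located as follows. The function names and the window size are hypothetical choices, the circularity of the chromosome is ignored, and which extreme corresponds to an origin depends on the strand convention, so both are reported as candidates only.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the cumulative GC skew, one value per non-overlapping window."""
    seq = seq.upper()
    running_total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C contribute no skew.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running_total += skew
        curve.append(running_total)
    return curve


def skew_extremes(curve, window):
    """Return the sequence positions (window ends) of the minimum and maximum."""
    lowest = min(range(len(curve)), key=curve.__getitem__)
    highest = max(range(len(curve)), key=curve.__getitem__)
    return (lowest + 1) * window, (highest + 1) * window


if __name__ == "__main__":
    # Synthetic sequence: a C-rich half followed by a G-rich half.
    toy = "GCCC" * 250 + "GGGC" * 250
    curve = cumulative_gc_skew(toy, window=100)
    print(skew_extremes(curve, window=100))
```

A genome with several origins, as reported for the Sulfolobus species, would be expected to give a curve with several turning points rather than a single minimum and maximum, which is why such predictions still require experimental confirmation.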


Transcription 


The first indications that the transcription ma- 
chinery and most of the information-processing ma- 
chinery in Archaea are more closely related to Eu- 
carya than to Bacteria came from in vitro inhibitor 
studies (see Chapter 6). The Sulfolobus and Halobacterium
DNA-dependent RNA polymerases (RNAPs)
were found to be resistant to rifampicin, streptolydigin,
and α-amanitin, similar to eucaryal RNAPs (453,
454). Archaeal RNAPs resemble eucaryal RNAP II
(and III) in sequence and consist of up to 13 different
subunits (Fig. 2) (25, 449, 453). The B subunit
is homologous to the B (second largest) subunit of the
eucaryal nuclear polymerases I, II, and III and to the
bacterial β-subunit (Fig. 2). The combined A and
C subunits in Archaea correspond to the largest A
subunits of the eucaryal RNAPs and to the bacterial
β′. Because the RNAP is composed of highly conserved
and variable regions, it has been useful for phylogenetic
analyses (see Chapter 19). Moreover, the
combined size of the RNAP genes (~8,500 bp) pro- 
vides a large dataset (~2,000 conserved amino acids), 
which may provide a better estimate of cellular evo- 
lution than rDNA sequences (203). 

Similar to Eucarya, the typical archaeal pro- 
moter contains an AT-rich TATA box ~22 to 28 bp 
upstream of the transcription initiation site (322, 
323) (see Chapter 6). In addition, a TATA box-binding
factor (TBP) and transcription factor B (TFB) participate
in promoter recognition and direct RNAP
binding to the promoter (21, 25, 26, 320, 323). These
transcription factors are sufficient for initiating transcription,
whereas additional factors are required in
Eucarya (21). In contrast, most of the regulatory pro- 
teins and the mechanism of gene regulation are more 
similar to Bacteria (24). Transcription termination 
seems to involve pyrimidine or T-rich stretches; for 
example, 3’ termini have been mapped in vivo to a lo- 
cation immediately downstream of a TTTTTYT se- 
quence that was part of a pyrimidine-rich region of 
16 to 19 nt (321, 431). In Methanothermobacter
thermautotrophicus, RNAP was found to terminate
in vitro at intergenic sequences similar to bacterial 
terminators, and stable RNA hairpins did not appear 
to be important (341). 
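
The two sequence signals described above, an AT-rich box some 22 to 28 bp upstream of the initiation site and the TTTTTYT motif near mapped 3' termini, lend themselves to a simple scan. The following is a sketch under stated assumptions: the 8-bp box length, the 75% A+T threshold, and measuring the distance from the center of the box are illustrative choices that are not specified in the text, and the function names are hypothetical.

```python
import re

# TTTTTYT, where Y stands for a pyrimidine (C or T).
TERMINATOR = re.compile(r"TTTTT[CT]T")


def terminator_sites(seq):
    """Return 0-based positions immediately downstream of each TTTTTYT match."""
    return [match.end() for match in TERMINATOR.finditer(seq.upper())]


def tata_candidates(seq, tss, box_len=8, min_at=0.75):
    """Return (distance, box) pairs for AT-rich windows upstream of tss.

    distance is counted from the center of the window to the transcription
    start site (0-based index tss) and must lie between 22 and 28 bp.
    """
    seq = seq.upper()
    hits = []
    for start in range(max(0, tss - 28 - box_len), tss - box_len + 1):
        box = seq[start:start + box_len]
        distance = tss - (start + box_len // 2)
        at_fraction = sum(base in "AT" for base in box) / box_len
        if 22 <= distance <= 28 and at_fraction >= min_at:
            hits.append((distance, box))
    return hits


if __name__ == "__main__":
    # Synthetic promoter: an AT-rich box centered 25 bp upstream of index 49.
    promoter = "G" * 20 + "TTTATATA" + "G" * 21 + "ATGC"
    print(tata_candidates(promoter, tss=49))
    print(terminator_sites("GGCCTTTTTCTGG"))
```

Neighboring windows that overlap the same box are reported as separate candidates; a practical scan would merge them or score the windows against a position weight matrix derived from mapped promoters.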


Posttranscriptional Gene Silencing 


The mechanism of RNA interference (RNAi), 
also termed posttranscriptional gene silencing, was 
one of the most exciting discoveries of the past decade 
in the field of gene regulation in Eucarya (241). Many 
of the proteins responsible for RNAi have recently 
been identified in Archaea and Bacteria and have be- 
come excellent models for studying the structure, 
function, and specificity of RNAi proteins (247, 295, 
296, 374, 442). In RNAi, an RNase III-like enzyme
(Dicer) cleaves dsRNA into short (21 to 23 nt) RNA 
duplexes, referred to as guide RNAs or siRNAs, 
which have 2-nt overhangs at the 3’ ends and 5'- 
phosphates (134). One of the strands is incorporated 
into the RNA-induced silencing complex (RISC), 
which then targets a complementary mRNA. A mini- 
mal RISC consists of the guide RNA plus a single pro- 
tein called argonaute (Ago). The protein mediates 
duplex formation and subsequently cleaves the com- 
plementary target mRNA via an endonuclease do- 
main, termed slicer or PIWI. The mRNA is further de- 
graded by exonucleases, and gene expression is thus 
silenced. Silencing is usually long lasting and stable, 
which makes RNAi an ideal tool for studying gene 
function in Eucarya. A related pathway utilizes micro 
RNAs (miRNAs) that are known to regulate transla- 
tion of an estimated 20% of all human genes. Micro 
RNAs are transcribed as precursors with stable hair- 
pin structures that are processed by the Dicer enzymes 
into short imperfect duplex structures resembling 
siRNA (432). Processed miRNAs are then integrated 
into RISCs that target the 3'-UTR of many mRNA 
transcripts and interfere with translation initiation. 
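
The geometry of the Dicer products described above (21-nt strands whose 3' ends each overhang by 2 nt) can be illustrated with a toy model. This sketch assumes a perfectly base-paired dsRNA, a fixed product length of 21 nt, and cuts that are phased from one end of the duplex; the function names are hypothetical, and the model says nothing about how the enzyme measures or selects its cleavage sites.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna):
    """Return the reverse complement of an RNA strand (written 5' to 3')."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense, length=21, overhang=2):
    """Cut a perfectly paired dsRNA into short duplexes.

    sense is the 5'-to-3' sequence of one strand; the other strand is assumed
    to be its exact reverse complement. Each returned pair holds two strands
    of `length` nt, written 5' to 3', that pair over length - overhang
    positions and carry `overhang` unpaired nucleotides at each 3' end.
    """
    duplexes = []
    for start in range(overhang, len(sense) - length + 1, length):
        sense_fragment = sense[start:start + length]
        # The partner strand pairs with a window shifted toward the 5' end of
        # the sense strand, which leaves a 3' overhang on both strands.
        partner = revcomp(sense[start - overhang:start - overhang + length])
        duplexes.append((sense_fragment, partner))
    return duplexes


if __name__ == "__main__":
    for sense_fragment, partner in dice("AUGC" * 11):
        print(sense_fragment, partner)
```

With the default values each duplex contains 19 paired positions; either strand of such a duplex could then serve as the guide that is loaded into RISC.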

Dicer and Ago are both multidomain proteins 
(Fig. 8) with multiple functions (241). It has been dif- 
ficult to dissect their structure and activities. Se- 
quences with similarity to the PIWI domain from Ago 
are present in many archaea and in bacteria. Full- 
length Ago is also present in several Euryarchaeota, 
but the degree of sequence conservation is low even 
within this archaeal kingdom. For example, Ago from 
Pyrococcus furiosus and Methanocaldococcus jan- 
naschii share only 27% amino acid identity. The re- 
cently solved X-ray structures of Ago from Archaeo- 
globus fulgidus, P. furiosus, and the bacterium Aquifex 
aeolicus revealed that the PIWI domain has structural 
similarity to RNase H (Fig. 8) (247, 295, 296, 374, 
442). Mutations within the RNase H domain of hu- 
man Ago 2 inactivate RISC and demonstrate that this 


domain is responsible for the nuclease activity (242, 
327). Guided by the structure of the P. furiosus en- 
zyme (374), it has also been demonstrated that the 
human Ago 2 enzyme possesses the Slicer activity 
of RISC. 

The mammalian Dicer is a large protein with,
among other domains, a DEAD box RNA helicase, a
PAZ, and two RNase III domains (Fig. 8) (241). The
helicase and RNase domains have homologs with low
sequence similarity in archaeal and bacterial genomes,
namely, the P. furiosus Hef helicase and the
Aq. aeolicus double strand-specific RNase III,
respectively (A. Kletzin, unpublished results). These
two enzymes are presently the only members of their
respective families that have three-dimensional (3D)
structures available (108, 210, 280). Thus, the eucaryal
RNAi enzymes Dicer and Slicer seem to be assemblies
of several bacterial and archaeal precursor domains,
which, in combination in the eucaryal enzymes, allow
recognition and targeting of specific mRNAs for degradation.

The biological importance of cellular posttran- 
scriptional gene silencing, in particular, in Bacteria 
and Archaea, is not fully understood. It could func- 
tion as an evolutionarily conserved endogenous de- 
fense to the presence of double-stranded RNA, and 
thus a defense mechanism against invading viral 
DNA, RNA, (retro-) transposons, and other heterol- 
ogous double-stranded genetic elements that may po- 
tentially harm the genetic integrity of the host (56). 
Gene silencing may also play a role as a regulatory 
mechanism involving the synthesis of antisense RNA, 
a phenomenon that has only been shown to occur in 
very few archaeal systems. For example, RNA-mediated
gene regulation has been demonstrated for the
immunity transcript of the haloarchaeal phage ΦH
(392, 393). The processing activity is not sequence 
specific but depends on the presence of an RNA du- 
plex. RNase activity was demonstrated in this sys- 
tem, but the enzyme responsible was not identified. 
These observations and the growing number of small 
noncoding RNAs (see Chapter 7) point to a poten- 
tially widespread role of RNAi-like mechanisms 
in Archaea. 


Translation 


Superficially, the archaeal translation machinery 
is similar to the bacterial machinery. Common fea- 
tures are 70S ribosomes with 30S and 50S subunits, 
similar-length ribosomal RNAs, transcriptional and 
translational coupling, the presence of polycistronic 
messages, and a frequent but nonexclusive use of 
Shine-Dalgarno sequences as ribosomal binding sites 
(reviewed in reference 244) (see Chapter 8). However, 
the archaeal translation system includes numerous

Figure 8. (See the separate color insert for the color version of this illustration.) Three-dimensional structures of archaeal Argonaute
proteins. (A) 3D structure of P. furiosus Ago with the PAZ domain (blue) and the PIWI domain (green/yellow) (PDB
code 1U04). (B, C) Similarity of the P. furiosus PIWI domain (B) with the catalytic core of the E. coli RNase H1 (C) (PDB code
1RDD) with the catalytic DDE triad and bound Mg²⁺ ion highlighted. The P. furiosus PIWI domain has a putative, similar
catalytic DDE triad and a conserved Arg (position 627). (D, E) 3D structure of the P. furiosus PAZ domain (D) and comparison
with the homologous domain of human Ago1 bound to an siRNA mimic (E) (PDB code 1SI3). (F) Domain structures of Ago
proteins, including N-terminal, linker (L1 and L2), PAZ, Mid, and PIWI domains and of human Dicer comprising a DEXH helicase,
a PAZ, two RNase III, and dsRBD domains and a conserved domain of unknown function (DUF). Panels A to E reproduced
with modifications from Current Opinion in Structural Biology (241) with permission of the publisher.


specific features not present in Bacteria, some of
which are specific to Archaea and others of which are
similar to Eucarya. For example, almost all antibiotics
inhibiting bacterial translation are ineffective in the 
Archaea (and the Eucarya) and vice versa, regardless 


of whether they affect the 30S or 50S subunits. Bac- 
teria use N-formyl-methionyl-tRNA for translational 
start codons, while Archaea and Eucarya use unmod- 
ified methionine. Approximately 21 genes encoding 
putative translation initiation factors have been identified
in archaeal genomes, most of which have homologs
in the Eucarya.

A feature that is common to Archaea and Eucarya
and that is therefore diagnostic for discriminating
Archaea from Bacteria is the finding that
elongation factor 2 (EF-2) is ADP-ribosylated by 
diphtheria toxin (193) (see Chapter 8). The archaeal 
and eucaryal EF-2s contain diphthamide, a posttrans- 
lationally modified histidine, which is the target of the 
toxin (Fig. 9) (175) (see Chapter 11). A short region 
that contains a histidine residue in EF-2 proteins is 
absent in the homologous EF-G protein from Bacte- 
ria, thereby rendering Bacteria resistant to the toxin 
(232). The in vivo role of diphthamide is not clear. 
Sulfolobus and yeast EF-2 overproduced in Es- 
cherichia coli contain the unmodified histidine, and 
these recombinant proteins had indistinguishable sta- 
bility and in vitro translation activity from the diph- 
thamide-containing native enzymes (68). This is an 
area open for investigation, because not all of the 
genes required for biosynthesis of diphthamide have 
been identified in Eucarya and only one of them, 
DPH2, has been identified in archaeal genomes (243).
Moreover, diphthamide biosynthesis has not been ex- 
perimentally examined in Archaea. 

The translation initiation factor IF-5D contains 
hypusine in Crenarchaeota and deoxyhypusine in 
Euryarchaeota (Fig. 9) (20, 362) (see Chapters 8, 9, 
and 11). The lysine derivative hypusine
(Nε-[4-amino-2-hydroxybutyl]-L-lysine) is synthesized in a
two-step reaction catalyzed by the enzymes deoxyhy- 
pusine synthase and deoxyhypusine hydroxylase 


[Figure 9 drawing: structural formulas of diphthamide (modified histidine) and hypusine (modified lysine).]


Figure 9. Chemical structures of diphthamide and hypusine. Mod- 
ifications (bold type) of the amino acids histidine and lysine to form 
diphthamide and hypusine, respectively. ADP ribosylation site cat- 
alyzed by diphtheria toxin (arrow). 


(294). The synthase but not the hydroxylase gene has 
been identified in the Eucarya and Archaea (43). 
Crystal structures of IF-5D have been obtained from 
three hyperthermophilic archaea, but equivalent struc- 
tures are not available from Eucarya (196, 298, 439). 
The structures reveal that the hypusine is positioned 
at the tip of an extended and exposed loop of con- 
served amino acid residues, which appears to mimic 
the anticodon loop of a tRNA molecule. Elongation 
factor P (EF-P) from Bacteria has low sequence but 
high structural similarity to archaeal IF-5D and con- 
tains a conserved and unmodified lysine residue at the 
equivalent position (43, 439). The EF-P appears to 
stimulate peptidyltransferase activity, whereas the 
role of IF-5D has not been clearly defined. Conditional
IF-5D mutants in yeast exhibit cell cycle arrest
(293), and a similar effect was observed in Archaea
by using the deoxyhypusine synthase inhibitor
N1-guanyl-1,7-diaminoheptane (166). These data suggest
that the initiation factor might be important for the 
translation of cell cycle proteins. 

One of the outstanding advances in structural bi- 
ology and in translation research was obtaining the 
3D structure of the 70S ribosome. Ribosomes of two 
species, the archaeon Haloarcula marismortui and the
thermophilic bacterium Thermus thermophilus, were
resolved to 2.4 and ~3 Å, respectively (Fig. 10). In
addition, functional complexes of tRNA molecules, 
initiation and elongation factors have been analyzed 
by cryogenic microscopy reconstruction studies 
guided by the 3D model from X-ray crystallography 
(282, 384, 441). Not only the initiation and elonga- 
tion reactions have been studied in Archaea, but also 
cytosolic factors required for the exit of the nascent 
polypeptide chain from the ribosome. The 3D struc- 
ture of a homodimeric “nascent polypeptide-associ- 
ated complex” previously identified in Eucarya and 
conserved in Bacteria and Archaea has recently been 
solved from Methanothermobacter marburgensis 
(377). The protein is associated with ribosomes and 
seems to have a role in cellular protein quality control. 

mRNA recognition and initiation of translation 
in Archaea appear to occur by more than one mech- 
anism (244) (see Chapter 8). Ribosome-binding sites 
(RBSs) are present upstream of start codons of many 
genes (244). However, other mRNAs have an ex- 
tremely short or absent 5'-untranslated leader, pre- 
venting the use of an RBS for ORF recognition (29, 
131, 372). In vivo studies using reporter genes in 
Haloferax volcanii have shown that the leaderless 
transcripts often show the strongest translation lev- 
els, whereas leader sequences seem to act in down- 
regulation of the translation activities and fine tuning 
of gene expression (345). Contrasting results were obtained
with a Sulfolobus cell-free in vitro translation

Figure 10. (See the separate color insert for the color version of this illustration.) Three-dimensional structure of the 50S subunit
of the Haloarcula marismortui ribosome. The ribosome arm around ribosomal protein L1 was omitted (for a more complete
picture see reference 200). Figure drawn from the coordinates from PDB entry 1QVF (200); ribosomal RNAs are displayed
in red (backbone) and gray (bases), proteins are displayed as yellow backbone ribbons. Top left, crown view; top
right, back view; bottom, bottom view; the circle indicates the position of the polypeptide exit tunnel.


system, which resulted in decreased translation levels 
with leaderless transcripts (57). The percentage of 
leaderless transcripts seems to vary within the Ar- 
chaea. Ten transcripts were mapped in Pyrobaculum 
aerophilum, and all were leaderless, and it was pro- 
posed that all transcripts in this organism should be 
leaderless (372). Similar bioinformatic predictions 
were made for other archaea (424), but few experi- 
mental data are available (131). 

Archaeal aminoacyl-tRNA synthesis displays a 
significant level of divergence from the E. coli model 
system (see Chapter 9). Novel archaeal mechanisms 
have been described for the aminoacylation of tRNAs 
with Asp, Cys, Glu, and Lys, some of which have 
been subsequently found in the Bacteria (310). Ar- 


chaea incorporate the 21st amino acid, selenocys- 
teine, in response to UGA stop codons and a cis-act- 
ing selenocysteine insertion sequence element (SECIS) 
in the mRNA, which forms a short stem-loop struc- 
ture in the RNA. The 22nd amino acid, pyrrolysine, 
was detected recently in five methanogenic archaea 
and in one bacterium. Pyrrolysine is inserted in re- 
sponse to UAG stop codons and requires a specific, 
but as yet undefined, sequence element (PYLIS) (444) 
(see Chapter 9). 
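
The recoding of stop codons described above can be made concrete with a short translation routine. This is a sketch only: real recoding depends on the SECIS and PYLIS elements in the mRNA, whereas here the recoded positions are simply supplied by the caller. The function name, the parameter recoded_codons, and the example mRNA are hypothetical; U and O are the standard one-letter symbols for selenocysteine and pyrrolysine.

```python
from itertools import product

# Standard genetic code, with codons enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Stop codons that can be reassigned: UGA to selenocysteine, UAG to pyrrolysine.
RECODING = {"UGA": "U", "UAG": "O"}


def translate(mrna, recoded_codons=frozenset()):
    """Translate an mRNA from its first codon to the first effective stop.

    recoded_codons holds the 0-based codon numbers at which a UGA or UAG is
    read as selenocysteine or pyrrolysine instead of terminating translation.
    """
    protein = []
    for index in range(0, len(mrna) - 2, 3):
        codon = mrna[index:index + 3]
        if index // 3 in recoded_codons and codon in RECODING:
            protein.append(RECODING[codon])
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    message = "AUGGCUUGAAAAUAGGGCUAA"
    print(translate(message))
    print(translate(message, recoded_codons={2, 4}))
```

Passing the positions explicitly keeps the example independent of any particular model of the SECIS or PYLIS signal, which for PYLIS is described in the text as not yet defined.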

A novel method of recoding nonsense codons for 
the incorporation of rare amino acids into foreign 
proteins has been described for E. coli, yeast, and 
mammalian cells (437), that relies on the expression 
of archaeal aminoacyl-tRNA synthetases in host cells.
In addition, many archaeal tRNAs cannot be charged
by host synthetases. The tRNA and corresponding
synthetase from M. jannaschii were expressed in
E. coli and coupled to the in vivo incorporation of the
amino acid O-methyl-L-tyrosine into dihydrofolate
reductase at a position encoded by an amber nonsense
codon (425). To overcome the restrictions of the
triplet code, recoding was successful using a Pyrococ- 
cus horikoshii type I lysyl-tRNA synthetase, which al- 
lowed the site-specific incorporation of L-homogluta- 
mine in response to a quadruplet codon, AGGA. The 
novel codon was resistant to frame-shift suppression. 
This system also permitted the simultaneous incor- 
poration of a second unnatural amino acid at distinct 
positions in myoglobin (8). More than 60 unnatural 
amino acids have been translationally incorporated 
into proteins. Many provide side chains with functional 
groups poised for covalent linkage with dyes and other 
markers, such as spin labels. In the future, selenocys- 
teine may also be translationally incorporated into pro- 
teins by site-directed mutagenesis without the need for 
extended RNA secondary structures. This would 
greatly expand the feasibility of performing spectro- 
scopic studies on cysteine-liganded metal centers. 


ARCHAEAL MODEL ORGANISMS 
AND IMPORTANT TAXA 


An ideal model organism may be considered as 
one that can reproduce and grow quickly, be easily 
handled in the laboratory, and possess characteristics 
analogous to those of many other organisms. No such 
organism is found in the microbial world because mi- 
croorganisms exist in markedly different habitats and 
exhibit enormous physiological diversity. At present, 
there is not a single archaeon for which the genetic, 
physiological, and biochemical tools have been devel- 
oped to an extent that is similar to E. coli. Even if 
such an organism does emerge, it can only be a model 
for a limited range of physiological traits and repre- 
sent a distinct component of archaeal diversity. 

Rather than one model organism, a broad range 
of archaea have proven useful for studying morphol- 
ogy, physiology, molecular mechanisms of adapta- 
tion, and so forth. These include Methanothermobac- 
ter thermautotrophicus, M. marburgensis, and Metha- 
nosarcina spp. for methanogenesis; Thermoplasma 
for proteolysis; Halobacterium for light-driven pro- 
ton translocation, gene regulation, chemotaxis, and 
gas vesicles synthesis; Archaeoglobus for sulfate re- 
duction; Pyrococcus and Acidianus ambivalens for 
inorganic sulfur metabolism; Sulfolobus, A. am- 
bivalens, Pyrococcus, and the methanogens for elec- 
tron transport chains; Sulfolobus and Pyrococcus for 


DNA replication and transcription; Halobacterium, 
Haloarcula, and Sulfolobus for translation, and the 
list continues. 

There are no model organisms representing ar- 
chaeal pathogens for the simple reason that there is 
not a single pathogen known, despite the fact that Ar- 
chaea are natural components of the microbiota of 
many, if not all, multicellular animals, especially her- 
bivores. It would be premature to assume that ar- 
chaeal pathogens do not exist because they have not 
been identified (52). Archaea are known to excrete 
proteases and cellulases, and some can colonize metazoan
hosts. Therefore, there is no fundamental barrier
to pathogenic potential toward animals or
plants. However, even with the use of modern molecular
biology techniques (i.e., not just relying on cultivation),
no archaeon has been clearly identified as the
causative agent of a disease. One recent report
describes Methanobrevibacter oralis as a potential
cause of endodontic infections, but others were un- 
able to find a correlation between this methanogen 
and the disease (370, 415). Even the growth of mem- 
bers of the Halobacteriales on dried salted fish may 
present only aesthetic issues as the Archaea do not 
cause illness. As a consequence, Archaea can be re- 
garded as nonpathogenic. An interesting conse- 
quence of this is that pathogen-related biosafety is- 
sues do not hinder laboratory work. 

In the remaining sections of this chapter, the 
properties of major archaeal taxa are described ac- 
cording to their ecology and molecular similarity. Im- 
portant characteristics of some of the key organisms 
are included. 


Halophilic Archaea 
Halophilic microorganisms and their habitats 


Halophilic microorganisms and, in particular, 
halophilic archaea have developed mechanisms of os- 
moprotection that allow them to thrive in habitats 
containing high levels of various ions (up to saturation, 
e.g., 5.2 M NaCl) (291). Hypersaline habitats 
are widespread around the world and can be differentiated 
according to their origin. Thalassohaline waters 
are dominated by NaCl and typically arise from seawater 
evaporation. These include natural or man-made 
environments such as crystallizer ponds of solar 
salterns, salt mine drainage waters, coastal splash 
zones, tide pools, and brine springs from underground 
salt deposits. Athalassohaline waters originate 
from inland surface water that leaves behind 
high concentrations of divalent cations such as Mg²⁺ 
(Dead Sea, Israel) or carbonate (122). Hypersaline 
environments vary in pH from near neutral (e.g., the 
Dead Sea, Israel; pH ~ 6) to alkaline, carbonate-rich 
waters (e.g., Lake Magadi, Kenya, and Big Soda Lake, 
Nevada; pH ~ 9 to 10) (122, 289). 

A large diversity of microorganisms of all three 
domains occupies both thalassohaline and athalassohaline 
habitats. These ecosystems can be very productive 
in terms of CO₂ fixation and cell densities. 
The most important primary producers are green algae 
(Dunaliella) and anoxygenic phototrophic bacteria 
such as Ectothiorhodospira and Halorhodospira 
(300). In contrast, extremely halophilic archaea of the 
order Halobacteriales are chiefly aerobic chemoorganoheterotrophs, 
growing with amino acids, peptides, 
organic acids, and carbohydrates. A high percentage 
of these are facultative anaerobes that grow 
by respiration of nitrate, DMSO, or trimethylamine N-oxide, 
and/or by fermentation (291). Novel strains were recently 
isolated that could grow on substrates other 
than complex organic nutrients, chemolithoautotrophically, 
or by anaerobic respiration with terminal 
electron acceptors such as sulfate, fumarate, or sulfur 
(83, 375). H. salinarum is well known for its capacity 
to grow photoorganoheterotrophically under low 
oxygen tension using the retinal protein bacteriorhodopsin 
(BR) as a light-driven proton pump (286). This property is restricted 
to a small number of Halobacteriales species. 

Many types of bacterial metabolism have not 
been found in extreme halophiles. This might be due 
to incomplete sampling or difficulties in cultivation. 
However, bioenergetic constraints may also limit certain 
types of chemolithotrophy and anaerobic respiration 
to habitats with lower salt concentrations. For 
example, free-energy calculations show that chemolithoautotrophic 
growth of halophilic methanogens 
on CO₂/H₂, or heterotrophic growth on acetate, should 
not be possible at salinities exceeding 1 M (291). Extremely 
halophilic methanogens have been isolated on 
substrates including trimethylamine (e.g., Methano- 
halobium evestigatum) (446). Neither Bacteria nor 
Archaea have been cultivated that (i) oxidize their 
substrates completely, (ii) grow at salinities above 
15%, and (iii) gain cellular energy by autotrophic am- 
monia or nitrite oxidation, by the anammox reaction, 
or by sulfate reduction (291). 
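
The salinity limits above are quoted partly in molar units (1 M) and partly as percentages (15%), which makes them awkward to compare. The following minimal Python sketch converts between the two. It assumes that the percentages are weight per volume (grams of NaCl per 100 ml), which the text does not state, and that NaCl is the only salt; the function names are illustrative.

```python
# Molar mass of NaCl in g/mol (Na 22.99 + Cl 35.45).
NACL_MOLAR_MASS = 58.44


def molar_to_percent_wv(molar):
    """Convert an NaCl concentration in mol/l to % (w/v), i.e., g per 100 ml."""
    grams_per_liter = molar * NACL_MOLAR_MASS
    return grams_per_liter / 10.0


def percent_wv_to_molar(percent):
    """Convert an NaCl concentration in % (w/v) to mol/l."""
    grams_per_liter = percent * 10.0
    return grams_per_liter / NACL_MOLAR_MASS


if __name__ == "__main__":
    # Values quoted in the text: 1 M, 5.2 M (saturation), and 15%.
    for molar in (1.0, 5.2):
        print(f"{molar} M NaCl = {molar_to_percent_wv(molar):.1f}% (w/v)")
    print(f"15% (w/v) NaCl = {percent_wv_to_molar(15.0):.2f} M")
```

Under these assumptions, 15% corresponds to roughly 2.6 M NaCl, and saturation at 5.2 M to roughly 30%.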


Adaptation to hyperosmolarity 


Maintaining an osmotic balance and protection 
of proteins from high salt concentrations are two as- 
pects important for cellular life in high-salt environ- 
ments. Cells typically have to maintain a lower water 
activity of the cytoplasm than the surrounding brine 
to maintain sufficient turgor pressure (123). Halo- 
philic archaea are exceptional in keeping the internal 
salt concentrations isoosmotic to the environment. 


Two strategies exist within the microbial world to 
cope with the osmotic stress: 


1. A strategy based on compatible solutes re- 
quires the accumulation of small organic molecules 
such as glycine-betaine, ectoine, polyols, sugars, or 
amino acids to balance the external salt concentration, 
while maintaining a low internal salt concentration. 
In this case, salt adaptation of cytoplasmic proteins 
is not required. Compatible solutes are used by 
extremely halophilic methanogenic archaea and most 
halophilic bacteria, whereas they are uncommon among 
Halobacteriales. A single report documents that some 
haloalkaliphilic strains accumulate 2-sulfotrehalose 
as a compatible solute when grown under low-nutri- 
ent conditions (67). 

2. The archaea in the order Halobacteriales, the 
halophilic anaerobic bacteria of the order Haloanaer- 
obiales, and Salinibacter use a “salt-in” strategy (269, 
292). KCl and NaCl are accumulated in the cyto- 
plasm to a concentration osmotically equivalent to 
the external medium (e.g., 4.2 M KCl plus 1 M NaCl 
in H. salinarum) (123). Cytoplasmic proteins need to 
be protected from precipitation or denaturation due 
to dehydration. This is accomplished by an adaptive 
change in the amino acid composition. Most proteins 
from haloarchaea (and also from Salinibacter) con- 
tain a small proportion of hydrophobic residues and 
an excess of acidic amino acids. The acidic amino 
acids are located predominantly on the surface of 
haloarchaeal proteins attracting cations in sufficient 
amounts to preserve a hydrated shell and to enhance 
solubility. As a consequence, most haloarchaeal pro- 
teins have an acidic pI (4 to 5) and depend on high- 
salt concentrations for function, in vivo and in vitro, 
although the salt can be replaced for in vitro experiments 
by neutral osmoprotectants such as sugar alcohols 
or glycerol. The ion balance is maintained by 
Na⁺ transport from the cytoplasm by an Na⁺/H⁺ antiporter 
and a K⁺ symporter, which are both driven 
by the proton motive force (123). 
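
The “salt-in” balance for H. salinarum can be checked with a back-of-the-envelope calculation. The sketch below treats KCl and NaCl as ideal, fully dissociated 1:1 salts (two osmotically active ions per formula unit) and compares the cytoplasmic values quoted above with a brine at the 5.2 M NaCl saturation limit mentioned earlier. Both the ideal-solution treatment, which ignores the substantial nonideality of such concentrated solutions, and the pairing with a saturated brine are assumptions made for illustration.

```python
import math


def ideal_osmolarity(salts):
    """Return the ideal osmolarity (osmol/l) of a salt mixture.

    `salts` maps a salt name to (molar concentration, ions per formula unit);
    complete dissociation and ideal behavior are assumed.
    """
    return sum(molar * ions for molar, ions in salts.values())


# Cytoplasmic concentrations quoted for H. salinarum (123).
cytoplasm = {"KCl": (4.2, 2), "NaCl": (1.0, 2)}
# Assumed external medium: NaCl-saturated brine (5.2 M).
brine = {"NaCl": (5.2, 2)}

inside = ideal_osmolarity(cytoplasm)
outside = ideal_osmolarity(brine)
print(f"cytoplasm: {inside:.1f} osmol/l, brine: {outside:.1f} osmol/l")
print("isoosmotic (ideal approximation):", math.isclose(inside, outside))
```

In this idealized picture the two compartments carry the same total ion concentration, which is what “osmotically equivalent” means for the salt-in strategy.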


The dependence of haloarchaeal proteins on high salt 
concentrations generally renders protein structure 
studies difficult; however, two structures of general importance 
have been determined. The ribosome of H. 
marismortui was specifically selected for structural 
studies because the salt masks negative charges of 
the ribosomal RNA. The structure has been refined to 
2.4 Å resolution, and many cocrystallizations with 
bound antibiotics have been successful (Fig. 10). The 
other extremely well studied haloarchaeal proteins for 
which X-ray structures exist are bacteriorhodopsin 
and its relatives, the halorhodopsins and sensory rhodopsins 
(see “Bacteriorhodopsins and light-driven ATP 
synthesis” and Fig. 13 below). In contrast, only eight 
other X-ray and two NMR structures have been solved 
from the Halobacteriales. These structures are for 
malate dehydrogenase, dihydrofolate reductase, dodecin 
(a dodecameric protein of unknown function), 
DpsA (an icosatetrameric protein involved in iron 
homeostasis), a nucleoside diphosphate kinase, a catalase 
peroxidase, an α-amylase, and a [2Fe-2S] ferredoxin 
(structures compiled from the structures 
browser at http://www.ncbi.nih.gov). To put this number 
in perspective, structures of more than 50 different 
proteins have been generated from Pc. furiosus, 
many of which did not come from the structural genomics 
project (see Chapter 20). 


Halobacteriales 


Typical habitats of Halobacteriales are salt lakes 
such as the Dead Sea, the salt lakes of the East African 
Rift Valley, and man-made habitats, including solar 
salterns, which consist of a chain of crystallizer ponds 
with successively increasing salt concentrations. 
While cell counts of bacterial and eucaryal species decrease 
with increasing salt concentration, haloarchaea 
become more predominant and densities may exceed 
1 × 10? ml⁻¹. The fascinating reddish color of solar 
salterns results from orange and red bacterioruberins 
(primarily C₅₀ carotenoids) in the haloarchaeal membranes, 
whose primary function is the protection of the 
cells against photooxidative damage caused by exposure 
to the intense sunlight of the Earth’s lower latitudes 
(Fig. 11). The pink or violet retinal-based pigments 
bacteriorhodopsin and halorhodopsin contribute to 
the coloration. Colonies of most Halobacteriales on 
agar plates are red or reddish in color. Haloarchaea 
(e.g., H. salinarum) are also found in hides, soy 
sauce, and dried fish, into which they are inoculated 
with the added salt. 

An exciting discovery was the detection of viable 
haloarchaea in fossil halite (NaCl) deposits (reviewed 
in reference 382), some of which have been cultivated 
(Table 2) (381). Massive sedimentation of halite and 
other salt deposits occurred during several periods in 
the Earth’s history, including the late Permian and early 
Triassic, ~245 to 280 million years ago (445). Some 
of the rock salts contain pink crystals from haloarchaea 
(Fig. 11). It is assumed that haloarchaea were en- 
trapped in the halite upon evaporation, most likely in 
microinclusions of saturated brines in the salt (123). 
The mechanisms of persistence remain speculative, and 
it has been debated whether the haloarchaea were en- 
trapped at the time of the rock salt formation or 
whether they invaded the salt in more recent times 
through the action of percolating ground water (260). 


Figure 11. (See the separate color insert for the color version of this 
illustration.) Haloarchaea in liquid cultures and within salt crystals. 
(A) Cultures of Haloferax and Halorubrum: first flask (front), 
H. volcanii WFD11 wild type; second flask, H. volcanii WFD11 
gas vesicle ΔD mutant (see Fig. 14); third flask, H. volcanii WFD11 
gas vesicle ΔD mutant complemented with the gvpD gene; fourth 
flask, Halorubrum vacuolatum wild type. (B) Himalayan rock salt 
(“Eubiona”; Claus, GmbH, Baden-Baden, Germany). (C, D) Crystals 
formed from dried Halobacterium cultures (cells trapped 
within). Bars, 1 cm. Crystals courtesy of F. Pfeifer, Darmstadt, Germany. 
Photographs by F. Pfeifer, Darmstadt, Germany (panel A), 
and A. Kletzin (panels B to D). 


It is well known that haloarchaea can be trapped in salt 
crystals and remain viable (Fig. 11) (123). While this 
is not a method commonly used by culture collections, 
many isolates have been preserved this way (292). 
Archaea tend to predominate in the oxic zones of 
the water column of hypersaline environments while 
Bacteria are more abundant in anoxic sediments (270). 
The majority of the archaeal fraction of anoxic zones is 
Halobacteriales despite their predominantly aerobic 
metabolism. Considerable numbers of extremely 
halophilic methanogens are also found that grow on 
trimethylamine and related compounds; isolates in- 
clude Methanohalophilus mahii and Methanohalo- 
bium evestigatum with optimal salt concentrations of 
12% and 24%, respectively (297, 446). 16S rDNA sig- 
natures of soil Crenarchaeota have also been detected 
from the microbial mats at the bottom of salterns and 
salt lakes, although none have been cultivated (270). 
Twenty-two genera are described that belong to 
the Halobacteriales (Table 2). They have a high degree 
of morphological variation (Fig. 12). Many, such 

Table 2. Characteristics of halophilic Halobacteriales (Euryarchaeota) (a)

Species | Morphology | Source habitat | Temperature optimum (°C) | pH optimum | NaCl optimum (M) (range)
Halobacterium salinarum NRC-1 and DSM 670 | Rods | Salted cow hide | 35-50 | 5.2-8 | 4-5 (3-5.2)
Halobacterium noricense | Rods | Permian rock salt, Austria | 45 | 5.2-7.7 | 3 (2.2-5.2)
Halalkalicoccus tibetensis | Coccoid | Lake Zabuye, Tibet, China | 40 | 9.5-10 | 3.4 (1.4-5.2)
Haloarcula marismortui | Pleomorphic flat | Dead Sea, Israel | 40-50 | Neutral | 3.4-3.9 (1.7-5.1)
Halobaculum gomorrense | Rod | Dead Sea, Israel | 40 | 6-7 | 1.5-2.5 (1.0-2.5)
Halobiforma haloterrestris | Pleomorphic | Hypersaline soil, Aswan, Egypt | 42 | ? | 3.4 (2.2-5.2)
Halococcus morrhuae | Coccus | Dead Sea, Israel | 30-45 | Neutral | 3.5-4.5 (2.5-5.2)
Haloferax mediterranei | Pleomorphic flat | Saltern, Spain | 40 | 6.5 | 2.9 (2.2-5)
Haloferax volcanii | Pleomorphic flat | Dead Sea, Israel | 40 | Neutral | 1.7-2.5
Haloferax sulfurifontis | Pleomorphic flat | Zodletone spring, Oklahoma, USA | 32-37 | 6.4-6.8 | 2.2-2.6 (1.1-5.2)
Halogeometricum borinquense | Pleomorphic flat | Saltern, Puerto Rico | 40 | 7 | 3.4-4.3 (1.4-5.2)
Halomicrobium mukohataei | Rod | Salt flats, Argentina | 40 | Neutral | 3-3.5 (2.5-4.5)
Haloquadratum walsbyi | Pleomorphic flat, square | Salterns, Sinai, Egypt | 40 | Neutral | 3.3
Halorhabdus utahensis | Pleomorphic | Great Salt Lake, Utah, USA | 50 |  | 4.7 (1.6-5.1)
Halorubrum saccharovorum | Rods | Saltern, California, USA | 50 |  | 3.5-4.5 (1.5-5.2)
Halorubrum vacuolatum | Rods | Lake Magadi, Kenya | 35-40 |  | 3.5 (2.6-5.1)
Halorubrum lacusprofundi | Rods | Deep Lake, Antarctica | 33-40 |  | 3.5 (2.6-5.1)
Halosimplex carlsbadense | Rods | Permian rock salt, New Mexico, USA | 37-40 |  | 4.4
Haloterrigena thermotolerans | Rods | Salterns, Cabo Rojo, Puerto Rico | 50 |  | 3-3.5 (2.2-5)
Natrialba asiatica | Rods | Beach sand, Japan | 35-40 |  | 3.5-4 (2.0-5.2)
Natrialba magadii | Rods | Lake Magadi, Kenya | 37-40 |  | 3.5 (2-5.2)
Natrinema pallidum | Rods | Salted cod | 37-40 |  | 3.4-4.3
Natronobacterium gregoryi | Rods | Lake Magadi, Kenya | 37 |  | 3 (2-5.2)
Natronococcus occultus | Coccus | Lake Magadi, Kenya | 35-40 |  | 3.4-3.8 (1.4-5.2)
Natronolimnobius baerhuensis | Rods | Soda lakes, Inner Mongolia, China | 37-45 |  | 3.4 (2.7-5)
Natronomonas pharaonis | Rods | Wadi Natrun, Egypt | 45 |  | 3.5 (2-5.2)
Natronorubrum tibetense | Pleomorphic | Alkaline salt lake, Tibet, China | 45 |  | 3.4
“Natronorubrum thiooxidans” | Pleomorphic | Kulunda steppe, Altai, Russia | 30-32 |  | 3.5 (2.5-5)

The pH optimum for Halobiforma haloterrestris is illegible. For the species from Halorhabdus utahensis onward, only five pH optimum values survive and their row assignment is not recoverable; in source order they are: 6.7-7.1; Neutral; Neutral; Neutral; 7-8.

The entries of the remaining three columns are incomplete and cannot be assigned to rows; they are listed here in source order.

Anaerobic growth (b): Arginine fermentation; Nitrate; Nitrate; Nitrate; Nitrate, sulfur (weak); Sulfur; Nitrate; Nitrate; n.r.; Glucose fermentation, sulfur respiration; Thiosulfate.

No. of species in genus: 3; 10; 14.

Remarks (c):
G, photoheterotrophic with bacteriorhodopsin, gas vesicles
Requires 0.6-0.9 M Mg²⁺
G*, crystal structure of ribosomes; genome composed of one chromosome and 8 (mega-) plasmids
G*, requires 0.6-1.0 M Mg²⁺
Polyhydroxybutyric acid granules
Sulfated heteropolysaccharide cell wall
Gas vesicles
G
First species with sulfur respiration
Gas vesicles
G, giant cells observed (≥40 × 40 µm); polyhydroxybutyric acid granules; requires pyruvate; high Mg²⁺ tolerance (>2 M)
Requires sugars, no growth on peptides/amino acids
Gas vesicles
Gas vesicles
Growth only on defined media: acetate + glycerol, glycerol + pyruvate, or pyruvate; three nonidentical 16S rDNA genes
Growth also at 60°C
G*, no colony pigmentation
Red colony pigmentation
Mixotrophic growth: acetate + thiosulfate oxidation in symbiosis with tetrathionate-oxidizing proteobacterium

(a) Data are compiled from Oren (292), and from original descriptions accessed from the taxonomy browser (http://www.ncbi.nih.gov). All genera of the Halobacteriales are represented. All strains grow aerobically by respiration, and colonies are usually red. 
(b) n.r., not reported; –, no anaerobic growth. 
(c) G, genome sequence available; G*, draft genome sequence (120). 


Figure 12. Morphology of haloarchaea. (A) Negative stain image of aerobically grown Halobacterium salinarum PHH1 with 
gas vesicles. (B) Negative stain image of aerobically grown H. mediterranei with gas vesicles. (C) Phase-contrast micrograph 
of H. volcanii WFD11 gas vesicle ΔD mutant (see Fig. 14). (D) Phase-contrast micrograph of Halorubrum vacuolatum with gas 
vesicles. (E, F) Negative stain electron micrographs of H. mediterranei cells showing pleomorphic shape. (G) Phase-contrast 
micrograph of H. volcanii WFD11 wild type. Photographs courtesy of T. Hechler, Darmstadt. 

as Halobacterium, are rods, and others, such as Natronococcus, 
are cocci. Many are flagellated and 
motile. The most characteristic morphology for members 
of the Halobacteriales (e.g., Haloferax) is that of flattened 
cells that look like pleomorphic discs when 
viewed from the top and thin rods when viewed from 
the edge (Fig. 12). This morphology is otherwise 
uncommon in the Archaea, with the exception of 
Pyrodictium spp. (see “Pyrodictiaceae,” below). Some 
haloarchaea have a square shape that resembles a 
postage stamp (also referred to as “Walsby’s square 
bacterium”; Fig. 4) (423). The length of the sides of 
square cells is typically <10 µm, with some occasionally 
≥40 µm, and the thickness of cells is typically 
0.25 µm. The thickness apparently remains constant 
with cell growth and cell division (192, 390). 
Cells with this morphology are stable only when the 
cytoplasm is isoosmotic with the environment. Square 
archaea account for up to 30% of the population in 
many solar salterns and salt lakes worldwide. Although 
square cells had been observed for many years, it was 
not until 2004 that isolates were first propagated in the 
laboratory and the species Haloquadratum walsbyi was 
described (33, 48). The isolation was successful using 
conventional methods and considerable persistence. 
H. walsbyi cells contain gas vesicles, similar to many 
other haloarchaea (Fig. 4) (see “Gas vesicles and 
transformation systems,” below). The genome of 
H. walsbyi is currently being sequenced 
(http://www.biochem.mpg.de/oesterhelt/). 

New strains have recently been isolated from 
salt-marsh sediments from the River Colne, Essex, 
United Kingdom, that grow at substantially lower salt 
concentrations than other haloarchaea (2.5 to 5% 
NaCl) and appear to represent a new genus within 
the Halobacteriaceae (313). Presumably, other halo- 
archaea adapted to low-salt concentrations will be 
found in the future and further expand knowledge of 
Archaea that are adapted to less extreme habitats. 

The hypersaline Deep Lake (Vestfold Hills, 
Antarctica) is interesting because it has a salt concen- 
tration near saturation (3.6 to 4.8 M) and remains 
ice-free throughout the year (51). The water temperature 
fluctuates between +10°C and −15°C. In situ 
microbial productivity is low, and only two isolates 
have been obtained, a Dunaliella species (Eucarya) 
and Halorubrum lacusprofundi (51, 99). The latter 
is a heterotrophic haloarchaeon with a Topt in the 
mesophilic range (Tables 2 and 7) (99). Cold adaptation 
has been linked to the presence of unsaturated diether 
lipid in cells grown at 12°C (111). 

Hypersaline waters accumulate in depressions in 
the Red Sea and in a range of other, mostly warm, 
marine zones worldwide. The temperatures can occasionally 
exceed 60°C. Thermophilic and halophilic 
microorganisms have not yet been cultivated from the 
brines. Molecular analysis revealed the presence of 
novel and previously uncultivated archaea and bacteria 
(74) and led to the retrieval of potentially valuable and 
novel hydrolases. An exciting finding was an esterase 
that showed remarkable activity in response to temperature, 
salt, and pressure (89). 


Bacteriorhodopsins and light-driven ATP synthesis 


Some species, such as H. salinarum, synthesize 
the membrane protein BR, primarily under conditions 
of light and low aeration. It contains covalently 
bound retinal as the single chromophore inserted between 
seven transmembrane helices (Fig. 13). BR is a 
light-driven proton pump generating an electrochemical 
gradient used for ATP synthesis by a proton-translocating, 
membrane-bound FoF1 ATP synthase. 
BR has a purple coloration with a marked absorp- 
tion maximum at about 570 nm (reviewed, for ex- 
ample, in references 75, 229). The molecules are not 
evenly distributed in the cytoplasmic membranes but 
localized in patches with the appearance and proper- 
ties of two-dimensional (2D) crystals. This purple 
membrane can easily be isolated with sucrose density 
gradient centrifugation and separated from the red- 
dish bacterioruberin-containing cytoplasmic mem- 
brane (see Chapter 22). 

The resolution of the 3D structure of the inte- 
gral membrane domains of BR was the first example 
of the use of the regular array of 2D crystals to en- 
hance electron microscopic images to high resolution 
(142). Additional spectroscopic studies revealed that 
the retinal in BR is in the all-trans conformation of the 
C20 chain in the ground state (Fig. 13). It becomes excited 
upon light absorption and converts to the cis 
conformation, which is coupled to the transfer of a proton 
to the outside surface of the membrane. The retinal 
returns to the ground state with the uptake of a 
proton from the cytoplasmic side of the membrane 
(75, 229). The electrochemical gradient can also drive 
Na⁺ export and K⁺ and nutrient import, in addition 
to ATP synthesis. The Na⁺/H⁺ antiporter also drives 
the uptake of amino acids indirectly, which is mediated 
by a Na⁺ symport system. 

In addition to BR, H. salinarum synthesizes at 
least three different retinal proteins with other functions. 
Chloride is accumulated as a counter ion to K⁺ 
in the cytoplasm by a light-driven chloride pump, 
named halorhodopsin. The retinal binds Cl⁻ and 
transfers it into the cell. Sensory rhodopsins control 
phototaxis together with two-component regulators 
and affect flagellar rotation (378) (see Chapters 11 
and 18). These retinal proteins were thought to be 
present only in archaea until the related proteorhodopsin 
genes were detected in uncultured marine bacteria, and 
xanthorhodopsin genes in the extremely halophilic 
bacterium, Salinibacter ruber (13). In S. ruber, one of 
the genes resembles bacterial proteorhodopsins, while 
the others are of the haloarchaeal type. The function 
of proteorhodopsins appears to be similar to that of BR. 


Figure 13. Three-dimensional model of the Natronomonas pharaonis sensory rhodopsin highlighting α-helices and retinal 
(black) and chemical structure of the all-trans form of retinal (PDB code 1H2S). 


Gas vesicles and transformation systems 


Many haloarchaea synthesize gas vesicles that are 
easily recognized as bright refractive bodies under 
phase-contrast microscopy (Figs. 4 and 12). These 
flotation devices are filled with gas in a watertight pro- 
teinaceous shell (422). Gas vesicles are used to adjust 
the buoyant density of the cells, enabling them to float 
at the water surface (Fig. 11) or, more commonly, to 
adjust the mean density of the cell to float in zones of 
the water column that are optimal for the growth of 
the organism (e.g., oxygen tension). Gas vesicles have 
mainly been studied in halophilic archaea and in 
cyanobacteria, but they are not restricted to these 
groups and occur in many freshwater bacteria such as 
Ancalomicrobium and Magnetospirillum (380, 422). 
Surprisingly, gas vesicle genes have also been found in 
soil bacteria, including Streptomyces spp. and Bacillus 
megaterium (238, 414), and the heterologous expression 
of the B. megaterium gene cluster led to gas 
vesicle production in E. coli. Gas vesicles are also present 
in Methanosarcina and Methanothrix (188), and 
gas vesicle genes are present in the genome sequence of 
Methanosarcina barkeri (Fig. 14). 

Haloarchaeal gas vesicles are spindle- or cylinder- 
shaped structures with conical ends, ~200 nm in di- 
ameter and 400 to 1,500 nm in length (287, 422). 
The vesicles consist of more than 90% of the small 
hydrophobic protein, GvpA, that forms semicrys- 
talline, 4.5-nm-wide helical ribs that are perpendicu- 
lar to the gas vesicle’s longitudinal axis. The second 
structural protein, GvpC, contains internal amino 
acid repeats that are reported to bind to each of two 
different ribs (422), and it is thought to stabilize the 
vesicles in cyanobacteria. The deletion of the gvpC 
gene in recombinant haloarchaea resulted in pleomorphic 
gas vesicles with variable diameter and rounded 
ends, instead of regularly arranged vesicles (287). 
These results suggest that GvpC plays a role in the de- 
termination of the shape of gas vesicles (287). 

Deletion and overexpression analysis showed 
that some of the proteins encoded in the haloarchaeal 
gene cluster (e.g., GvpD and GvpE) have regulatory 
functions and are not required for gas vesicle synthesis. 
Other proteins encoded in the cluster (e.g., GvpI 
and GvpH) play a role in determining the length, stability, 
or number of vesicles per cell. The minimal region 
required to synthesize small vesicles comprises 
eight of the fourteen gvp genes that are present in 
most haloarchaeal gvp clusters (Fig. 14) (287). The 
Methanosarcina barkeri gvp cluster contains the 
same conserved genes that have been described as essential. 
The role of most of the genes has not been 
well defined. It has been suggested that some genes 
(F, G, J, L, M) might encode minor structural proteins, 
since the proteins are detectable in immunochemical 
analysis of whole vesicles (368). The amino acid sequences 
of GvpJ and GvpM are similar to that of the main 
structural protein GvpA (305), and these proteins may 
participate in gas vesicle formation by serving as a scaffold or 
nucleation core. 

Figure 14. Gas vesicle operons in Archaea. gvp genes are shown as boxes above or below according to their direction of transcription, 
and promoter regions are denoted by arrows; p-vac, plasmid-encoded vac region; c-vac, chromosomally encoded 
vac region; mc-vac, chromosomally encoded vac region of H. mediterranei; ΔD, in-frame deletion mutant of the mc-vac region, 
leading to gas vesicle-overproducing H. volcanii transformants; nv-vac, chromosomally encoded vac region of Halorubrum 
vacuolatum (formerly Natronobacterium vacuolatum); ISH30, insertion element; mb-vac, chromosomally encoded vac region 
from the genome of Methanosarcina barkeri; GvpE and GvpD, transcriptional activator and repressor, respectively. Modified 
from a figure provided by F. Pfeifer, Darmstadt, with reference to several sources (287, 305, 306) and the M. barkeri genome 
sequence (GenBank NC_007355). 

Haloarchaea, such as Haloferax and Halorubrum, 
contain a single gvp cluster in the genome. Multiplications 
of the gvpA gene are present in Methanosarcina 
(Fig. 14) and are common in bacteria. A haloarchaeal 
exception is H. salinarum PHH1, which contains two 
complete and similar gvp clusters. One cluster (c-vac) 
is encoded on a large megaplasmid and the other 
(p-vac) on a 150-kbp plasmid, pHH1. A deletion strain, 
PHH4, which contains only the c-vac region, produces 
long, cylinder-shaped vesicles during stationary 
phase (304). The expression of the p-vac region 
is constitutive and produces small, spindle-shaped 
vesicles. GvpD and GvpE are involved in regulation 
of the gvp genes (288). GvpE is a dimer-forming, 
DNA-binding, coiled-coil, leucine-zipper protein that 
is required for transcriptional activation (217) (see 
Chapter 6). GvpD is a repressor that interacts with 
GvpE through protein-protein contacts (457). An in-frame 
deletion of large parts of the Haloferax mediterranei 
gvpD gene (mc-gvpD) results in a gas vesicle-overproducing 
phenotype in H. volcanii recombinants 
(Figs. 11, 12, and 14) (87). The cells, which are normally 
flat, pleomorphic discs, become coccoid and resemble 
inflated balls. 

The analysis of gas vesicle synthesis and regulation 
has required the use of a gvp-negative H. volcanii 
strain and transformation systems based on three different 
mutually compatible haloarchaeal shuttle vectors 
(32) (see Chapter 21). The plasmids for H. volcanii 
include pWL101 and pMS20, which are general 
cloning vectors that utilize native promoters of cloned 
genes for their expression, and pJAS35, which carries 
the strong constitutive ferredoxin promoter for 
overexpression of introduced genes. Gene expression 
levels can also be measured quantitatively 
using a haloarchaeal β-galactosidase 
(bgaH) reporter gene (124). In addition, it is possible 
to generate chromosomal gene knockouts (299). 
Haloarchaea are the most easily genetically manipulated 
members of the Archaea (299). 


Archaeocins 


Bacteriocins are excreted peptides, either ribosomally 
translated or synthesized by peptidyl transferases, 
that have bactericidal activity toward related, 
non-bacteriocin-producing bacterial strains. 
Many members of the Halobacteriales produce archaeocins, 
proteinaceous archaeal antimicrobials. At 
least ten different archaeocins (also termed halocins) 
have been characterized (reviewed in reference 285). 
Sulfolobus islandicus also produces a 20-kDa ar- 
chaeocin (sulfolobicin) that is attached to S-layer- 
derived protein particles (311). Sulfolobicin only has 
activity against its own species. In contrast, halocins 
have activity against many different species (285). Ar- 
chaeocin activity is usually detectable in cultures at 
the beginning of stationary phase. Halocins can be 
differentiated into two classes. Most consist of pro- 
teins of 30- to 35-kDa molecular mass, while the mi- 
crohalocins are small peptides of 3 to 4 kDa. The 
large proteinaceous halocins are easily inactivated by 
dialysis against low-salt buffers. The peptide micro- 
halocins are more robust and can withstand extensive 
heat treatment and desalting. Both size classes of 
halocins are sensitive to proteases. The 36 amino acids 
of the microhalocin HalS8 are excised from the inte- 
rior of a 311-amino-acid precursor protein by an un- 
known protease activity. Microhalocins exhibit cross- 
kingdom toxicity as they are active against Sulfolobus 
species (Crenarchaeota) in addition to a broad spec- 
trum of antihaloarchaeal activity (Euryarchaeota). 

The mechanism of halocin immunity was recently 
studied with the 76-amino-acid microhalocin, 
HalC8, which is excreted by Halobacterium strain 
AS7092 (395). This halocin has high stability to denaturing 
agents and has a wide spectrum of antiarchaeal 
and antibacterial activity that includes most 
haloarchaea and some halophilic bacteria. HalC8 is 
the C-terminal proteolysis product of a 283-amino-acid 
precursor protein. HalI and HalC8 are localized 
to the cytoplasmic membrane. The unprocessed precursor 
and the N-terminal 207-amino-acid domain 
(HalI) are able to block halocin activity of mature 
HalC8 in vitro. Heterologous expression of the halI 
gene in the halocin-sensitive strain, Haloarcula hispanica, 
induced a remarkable HalC8 resistance, indicating 
that the HalI domain might cause immunity. In 
vitro assays confirmed that a strong interaction between 
HalI and HalC8 exists. Not only does the precursor protein 
appear to be required to transfer the halocin 
across the cytoplasmic membrane, but its proteolysis 
product(s) also seem(s) to play an important role in 
immunity (395). 


Hyperthermophilic Archaea 


Hyperthermophilic microorganisms 
and their habitats 


Hyperthermophiles are defined as microorganisms
having a Topt of 80°C or higher, and thermophiles
as having a Topt of 50 to 80°C (386).
Hyperthermophiles are found in most hot-water
environments worldwide. One hyperthermophilic ar- 
chaeon has been reported to grow at temperatures up 
to 121°C (179). Most hyperthermophiles are Archaea 
(two hyperthermophilic genera of Bacteria versus 
>20 of Archaea; 159). Hyperthermophilic (*) and 
thermophilic species are found in many orders and 
genera of the Archaea. Among the methanogens, all 
species of the genera Methanopyrus*, Methanother-
mus*, Methanothermobacter, Methanocaldococcus*,
and Methanotorris* are thermophiles or hyperther- 
mophiles. Hyperthermophiles or thermophiles are 
also represented by members of the Archaeoglob- 
ales*, Picrophilus, Thermoplasma, Thermococcales*, 
Nanoarchaeum*, Desulfurococcales*, Thermopro-
teales*, and Sulfolobales*.
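
The temperature cut-offs given above can be written as a small classifier. This is only an illustrative sketch: the function name and the label for organisms below 50°C are assumptions made here, and the boundary case of exactly 80°C is assigned to the hyperthermophiles because the definition reads "80°C or higher". The example Topt values are taken from Tables 3 and 8.

```python
def thermal_class(t_opt_c: float) -> str:
    """Classify an organism by its optimal growth temperature (Topt, in °C).

    Cut-offs follow the definitions quoted in the text:
    hyperthermophile: Topt >= 80°C; thermophile: 50°C <= Topt < 80°C.
    The label for lower optima is a placeholder chosen for this sketch.
    """
    if t_opt_c >= 80:
        return "hyperthermophile"
    if t_opt_c >= 50:
        return "thermophile"
    return "below the thermophile range"


# Topt values as listed in Tables 3 and 8 (lower bound used where a range is given).
examples = {
    "Pyrococcus furiosus": 100,
    "Thermoplasma acidophilum": 59,
    "Ferroplasma acidiphilum": 35,
}

for name, t_opt in examples.items():
    print(f"{name}: Topt {t_opt}°C -> {thermal_class(t_opt)}")
```

A single threshold function like this is deliberately crude; it ignores the growth range, which for many species in the tables extends well beyond the optimum.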

Typical habitats are terrestrial hot springs and 
solfataric areas in geologically active zones, volcanic 
hot spots, geological plate-spreading or subduction 
zones, and fault lines (Fig. 15). Other hot-water
environments that have not been well studied include
the deep-subsurface aquifers of the Great Artesian
Basin in Australia (=100°C), which cover vast areas
of the continent and contain significant microbial
activity (197), and the deep-sea brines (≤70°C) at the
bottom of deep-sea clefts in the Mediterranean Sea
and Red Sea (see “Halobacteriales,” above) (74).
More studies have been performed on the spectacular
“black smokers,” submarine structures emitting
superheated (≤350 to 400°C), H2S-reduced fluids that
form dark clouds when released into the oxidized,
cold (2 to 4°C) seawater (135). The more common
“white smokers” (20 to 350°C) also harbor many
hyperthermophiles. The vent fluid originates as cold
seawater that seeps into the fissures of ridge systems
and reaches heated rocks and/or magma. It is extruded
after becoming enriched with many minerals from
passage through rock (e.g., Si, Fe, Mn, Zn, Cu) while
remaining depleted of others (Mg, Mo). Insoluble
metal sulfides, sulfates, and/or silica precipitate as
soon as the vent discharge mixes with seawater, and
this leads to the formation of porous and impressive
vent structures (215, 418). The pores are densely
populated with hyperthermophilic microorganisms,
including methanogens, fermentative Thermococcales,
and heterotrophic or autotrophic sulfate and sulfur
reducers (324, 399).

Figure 15. (See the separate color insert for the color version of this illustration.) Solfatara and Pisciarelli fumaroles. (Left)
Fumaroles in the Solfatara caldera (Pozzuoli near Naples, Italy) with deposition of sulfur, mercury, and arsenic salts. (Right)
Fumarole-heated hole with boiling water, typical of habitats for Sulfolobales (Pisciarelli, near Naples, Italy). Photos taken by
A. Kletzin.

Submarine vents are not restricted to deep-sea 
settings. There are many examples of shallow hot- 
water vents close to islands, such as those at a depth 
of ~3 m near Vulcano Island beach, Italy. These were
the sources of several well-studied hyperther- 
mophiles, including P. furiosus/P. woesei, Thermo- 
coccus celer, and Pyrodictium occultum (Tables 3 to 
5) (91, 389, 448). Hyperthermus butylicus was likewise
isolated from a comparable site at a depth of 8 m near
São Miguel, Açores, Portugal (447).

Microorganisms are thermally stratified within 
smoker walls according to their optimal growth con- 
ditions, suggesting that the steep gradients of temper- 
ature, pH, redox potential, and substrates provide 
numerous niches for specialized species of anaerobic 
and aerobic bacteria and archaea (136, 399). Micro- 
bial ecosystems are also thought to exist beneath hy- 


drothermal vent systems and to be dominated by hy- 
perthermophilic, chemolithotrophic methanogens and 
heterotrophic sulfur-reducing Thermococcales (“Hy- 
perSLiME”: hyperthermophilic subvent lithoau- 
totrophic microbial ecosystem) (398). An interesting 
example of an established thermophilic hydrothermal 
environment is the “Lost City” that is located ~15 km 
away from the Mid-Atlantic Ridge (184) and consists
of carbonate towers up to 30 m in height with an age
of >30,000 years. The alkaline vent fluids contain
large amounts of H2 and methane but essentially no
CO2. Methane-producing Methanosarcinales are
abundant in hotter regions of the “Lost City” vents 
(361), and anaerobic methane-oxidizing archaea 
from the ANME-1 group have been detected in cooler 
regions of the vents [see “(Anaerobic) methane oxi- 
dation by methanotrophic archaea,” below] (185). 
The metabolism of sulfur and inorganic sulfur 
compounds (ISCs) is one of the hallmarks of hyper- 
thermophilic archaea, and this is reflected in the num- 
ber of species that use ISCs as electron acceptor or 
donor (Tables 3 to 9). This type of metabolism cor- 
relates with the abundance of S° and ISCs in volcanic 
settings. ISCs can account for ≤10% of the dry volume
of the gaseous emissions from terrestrial sources
or 1% of fluid volume emitted from hydrothermal
vents (Fig. 15). The proportions of SO2, SO3, H2S, and
S° can vary (391). For example, H2S dominates in
geologically older terrestrial hydrothermal fields and
submarine vents. In contrast, SO2 and SO3 are more
abundant in younger or more active volcanic sites.
Table 3. Characteristics of hyperthermophilic Thermococcales (Euryarchaeota) and Nanoarchaeum growing at neutral or alkaline pHᵃ

| Species | Source/habitat | Temperature optimum (range) (°C) | pH optimum (range) | Metabolism | Substrates | Electron acceptors |
|---|---|---|---|---|---|---|
| Pyrococcus furiosus/woeseiᶜ | Shallow vent, Vulcano Island, Italy | 100-103 (70-105) | 7 (5-9) | Heterotrophic/mixotrophic | Amino acids, maltose, complex organics | S°, H+, organics |
| Pyrococcus horikoshii | Okinawa Trough vents, Northeastern Pacific Ocean | 98 (80-102) | 7 (5-8) | Heterotrophic/mixotrophic | Complex organics | S°, organics |
| Pyrococcus abyssi | Deep-sea vent, North Fiji Basin, Pacific Ocean | 96 (67-100) | 6.8 (4-8.5) | Heterotrophic/mixotrophic | Amino acids, maltose, starch, pyruvate, complex organics | S°, organics |
| Thermococcus kodakaraensis | Shallow vent, Kodakara Island, Kagoshima, Japan | 96 (67-100) | 6.9 (4-8.5) | Heterotrophic/mixotrophic | Amino acids, maltose, complex organics | S°, organics |
| Thermococcus litoralis | Shallow vent, Bay of Lucrino, Naples, Italy | 88 (65-95) | 7.2 (6.2-8.5) | Heterotrophic/mixotrophic | Peptides, pyruvate | S°, organics |
| Thermococcus celer | Shallow vent, Vulcano Island, Italy | 87 (-93) | 5.8 (4-7) | Heterotrophic/mixotrophic | Amino acids, sugars, peptides | S°, organics |
| Thermococcus alcaliphilus | Shallow vent, Vulcano Island, Italy | 85 (54-91) | 9 (6.5-10) | Heterotrophic/mixotrophic | Amino acids, peptides | S°, organics |
| Palaeococcus ferrophilus | Deep-sea vent, North Fiji Basin, Pacific Ocean | 83 (60-88) | 6.0 (4-8) | Heterotrophic/mixotrophic | Complex organics | S°, organics |
| Nanoarchaeum equitans | Kolbeinsey Ridge, Iceland | 90 (70-98) | 6 (4-8.5) | Unclear | H2/CO2 | S° |

No. of species in genus: 5; 29

Remarksᵇ:
G, structural genomic project
G
High G+C species
Grows exclusively with Ignicoccus strain KIN/4 in symbiosis; organism with smallest genome to date

ᵃData compiled from Bertoldo and Antranikian (28) and from original descriptions from the taxonomy browser (http://www.ncbi.nih.gov). All Thermococcales strains grow anaerobically by fermentation. Cells are coccoid. All genera of the Thermococcales are represented.
ᵇG, genome sequence available.
ᶜVery closely related and probably strains of a single species (133).
Table 4. Characteristics of anaerobic, hyperthermophilic sulfate reducers from the Euryarchaeota (Archaeoglobus, Ferroglobus, and Geoglobus) and Crenarchaeota (Caldivirga) kingdomsᵃ

| Species | Morphology | Source/habitat | Temperature optimum (range) (°C) | pH range | Metabolism | Substrates | Electron acceptors | No. of species in genus | Remarksᵇ |
|---|---|---|---|---|---|---|---|---|---|
| Archaeoglobus fulgidus | Irregular coccoid | Shallow vents, Vulcano Island, Italy; submarine vents, oil wells | 83 (50-90) | 5.5-8 | Facultatively chemolithoautotrophic | H2, lactate, pyruvate, formate | Sulfate, sulfite, thiosulfate | 4 | G |
| Ferroglobus placidus | Irregular coccoid | Shallow vents, Vulcano Island, Italy | 85 (65-95) | 6-8.5 | Facultatively chemolithoautotrophic | Fe2+, sulfide, H2 | Nitrate, thiosulfate | 1 | Anaerobic degradation of acetate and aromatic hydrocarbons |
| Geoglobus ahangari | Pleomorphic | Submarine hydrothermal vents, Guaymas | 88 (65-90) | 6-7.5 | Facultatively chemolithoautotrophic | H2, organic acids, amino acids, complex organics | Fe3+, sulfide, H2 | 1 | |
| Caldivirga maquilingensis | Straight or curved rods | Mt. Maquiling, Laguna, Philippines | 85 (60-92) | 2.3-6.4 | Heterotrophic | Glycogen, gelatin, amino acids, complex organics | Sulfate, S°, thiosulfate | 1 | |

ᵃData compiled from several sources (129, 137, 162, 180).
ᵇG, genome sequence available.
Table 5. Characteristics of hyperthermophilic Desulfurococcales (Crenarchaeota) growing at approximately neutral pHᵃ

| Species | Morphology | Source/habitat | Temperature optimum (range) (°C) | pH optimum (range) | Aerobic/anaerobic | Metabolism | Substrates | Electron acceptors | No. of species in genus | Remarksᵇ |
|---|---|---|---|---|---|---|---|---|---|---|
| Desulfurococcus mucosus | Cocci | Terrestrial hot spring areas, Iceland | 76-93 | 6 (4.5-7) | Anaerobic | Heterotrophic | Complex organics | S° | 5 | Fermentation, sulfur respiration |
| Aeropyrum pernix | Cocci | Shallow marine vent, Kodakara Island, Japan | 90-95 (70-100) | 7 (5-9) | Aerobic | Heterotrophic | Complex organics | O2 | 2 | G; thiosulfate stimulatory |
| Ignicoccus hospitalis KIN/4 | Cocci | Marine vents, Kolbeinsey Ridge, Iceland | 90 (70-98) | 6 | Anaerobic | Chemolithoautotroph | H2 | S° | 3 | S°/H2 autotrophy |
| Staphylothermus marinus | Cocci | Shallow marine vent, Vulcano Island, Italy | 65-98 | 4.5-8.5 | Anaerobic | Heterotrophic | Peptides, complex organics | S° | 2 | |
| Stetteria hydrogenophila | Cocci | Shallow vent, Milos, Greece | 95 (68-102) | 6 (4.5-7) | Anaerobic | Mixotrophic | H2, complex organics | S°, thiosulfate | 1 | |
| Sulfophobococcus zilligii | Cocci | Terrestrial hot spring areas, Iceland | 85 (70-95) | 7.5 (6.5-8.5) | Anaerobic | Heterotrophic | Yeast extract | Organics | 1 | Fermentation |
| Thermodiscus maritimus | Disc-shaped cocci | Shallow vent, Vulcano Island, Italy | 90 (75-98) | 5.5 (5-7) | Anaerobic | Heterotrophic | Complex organics | S°, organics | 1 | Fermentation, sulfur respiration |
| Thermosphaera aggregans | Cocci | Terrestrial hot spring areas, Yellowstone, WY, USA | 85 (65-90) | 6.5 (5-7) | Anaerobic | Heterotrophic | Complex organics | Organics | 1 | Fermentation |
| Pyrodictium occultum | Disc-shaped, flat | Marine shallow vents, Vulcano Island, Italy | 105 (85-110) | 5.5 (4.5-7.2) | Anaerobic | Chemolithoautotroph | H2 | S°, thiosulfate | 3 | S°/H2 autotrophy |
| Pyrodictium abyssi | Discs | Deep-sea vent, TAG site, Mid-Atlantic Ridge | 97 (80-110) | 5.5 (4.7-7.1) | Anaerobic | Chemolithoautotroph | H2, organics | S° | | S°/H2 autotrophy; fermentation |
| Hyperthermus butylicus | Cocci | Marine shallow vent, São Miguel, Açores | 95-106 (72-108) | 7 | Anaerobic | Heterotrophic | Peptides | Organics | 1 | G, fermentation of peptides |
| Pyrolobus fumarii | Irregular cocci | Deep-sea vent, TAG site, Mid-Atlantic Ridge | 106 (90-113) | 5.5 (4-6.5) | Anaerobic/microaerophilic | Chemolithoautotroph | H2 | Nitrate, thiosulfate, O2 | 1 | “Hottest” validly described organism |
| Strain 121 | Cocci | Deep-sea Mothra vent field, Juan de Fuca Ridge, east Pacific | 106 (80-121) | n.r.ᶜ | Anaerobic | Heterotrophic | Formate | Fe3+ | - | “Hottest organism” |
| Acidilobus aceticus | Cocci | Terrestrial acidic hot springs, Kamchatka, Russia | 85 (60-92) | 3.8 (2-6) | Anaerobic | Heterotrophic | Starch, complex organics | S°, organics | 1 | |
| “Caldococcus noboribetus” | Cocci | Terrestrial acidic hot springs, Japan | 92 (70-92) | 3 (1.5-4) | Anaerobic | Heterotrophic | Complex organics | S°, organics | 1 | |
| Ignisphaera aggregans | Cocci | Terrestrial solfataric fields, Rotorua, New Zealand | 92-95 (72-108) | 6.4 (5.4-7) | Anaerobic | Heterotrophic | Sugars, carbohydrates, complex organics | Organics | 1 | Strictly fermentative, consortia with Pyrobaculum sp. |

ᵃData compiled from Huber and Stetter (155), and from original descriptions obtained from the taxonomy browser (http://www.ncbi.nih.gov). All genera of the Desulfurococcales are represented.
ᵇG, genome sequence available.
ᶜn.r., not reported.


Table 6. Characteristics of hyperthermophilic Thermoproteales (Crenarchaeota) growing at acidic to neutral pHᵃ

| Species | Source/habitat | Temperature optimum (range) (°C) | pH optimum (range) | Aerobic/anaerobic | Metabolism | Substrates | Electron acceptors | No. of species in genus | Remarksᵇ |
|---|---|---|---|---|---|---|---|---|---|
| Thermoproteus tenax | Solfataric field, Krafla, Iceland | 88 (70-98) | 5.5 (2.5-6) | Anaerobic (moderately aerotolerant) | Facultatively lithoautotrophic | H2, amino acids, sugars, complex organics | S° | 3 | G |
| Thermoproteus neutrophilus | Solfataric field, Iceland | 85 (-97) | 6.5 (5-7.5) | Anaerobic | Facultatively lithoautotrophic | H2, amino acids, sugars | S° | | |
| Pyrobaculum aerophilum | Maronti Beach, Ischia Island, Italy | 100 (75-104) | 7 (5.8-9) | Facultatively anaerobic | Facultatively lithoautotrophic | H2, thiosulfate, complex organics | O2, nitrate, nitrite, Fe3+, arsenate | 7 | G, this represents the only marine species of Pyrobaculum |
| Pyrobaculum islandicum | Geothermal power plant, Iceland | 100 (74-102) | 6 (5-7) | Anaerobic | Facultatively lithoautotrophic | H2, complex organics | S°, thiosulfate, sulfite, Fe3+, cysteine, oxid. glutathione | | |
| Thermocladium modestus | Acidic hot spring areas, Japan | 75 (45-82) | 4.2 (2.6-5.9) | Anaerobic (moderately aerotolerant) | Heterotrophic | Complex organics | S°, thiosulfate, sulfate, cysteine | 1 | |
| Caldivirga maquilingensis | Acidic hot springs, Philippines | 85 (60-92) | 3.7-4.2 (2.3-6.4) | Anaerobic (moderately aerotolerant) | Heterotrophic | Complex organics | S°, thiosulfate, sulfate | 1 | |
| Vulcanisaeta distributa | Hot spring areas, eastern Japan | 85-90 | 4-4.5 | Anaerobic | Heterotrophic | Complex organics | S°, thiosulfate | 2 | |
| Thermofilum pendens | Solfataric fields, Iceland | 88 (70-95) | 5.5 (4-6.5) | Anaerobic | Heterotrophic | Peptides | S° | 2 | Requires extract of other archaea |

ᵃData compiled from Huber et al. (153), and from original descriptions from the taxonomy browser (http://www.ncbi.nih.gov). Cells are stiff rods (Thermoproteaceae) or very thin long rods (Thermofilaceae). All genera of the Thermoproteales are represented.
ᵇG, genome sequence available.


Table 7. Characteristics of the cold-adapted Nitrosopumilus, Cenarchaeum (Crenarchaeota), and SM1 (Euryarchaeota)ᵃ

Species (morphology): Nitrosopumilus maritimus (short, thin rods); Cenarchaeum (short, thin rods); Euryarchaeon SM1 (cocci)
Source/habitat: Aquarium seawater tank, Seattle, WA, USA; Pacific coast off California, USA; cold sulfidic springs, Germany
Temperature (°C): 28; 10-11
pH: 7-7.2; 7.2-7.3
Aerobic/anaerobic growth: aerobic; aerobic; n.r.
Type of metabolism: chemolithotrophic
Electron acceptors: O2
Electron donors: NH3
Remarksᵇ: G, endosymbiont in the marine sponge Axinella mexicana; cultivation only in situ, consortia with Thiothrix-like bacteria

ᵃThe cold-adapted haloarchaeon H. lacusprofundi is described in Table 2, and the methanogens M. burtonii and M. frigidum are described in Table 10. Data compiled from several sources (51, 144, 212, 335). n.r., not reported.
ᵇG, genome sequence available.
Table 8. Characteristics of thermoacidophilic Thermoplasmatales (Euryarchaeota)ᵃ

| Species | Morphology | Source/habitat | Temperature optimum (°C) | pH optimum (range) | Aerobic/anaerobic growth | Substrates | Electron acceptors | No. of species in genus | Remarksᵇ |
|---|---|---|---|---|---|---|---|---|---|
| Thermoplasma acidophilum | Pleomorphic | Smoldering coal refuse piles, solfataric fields | 59 | 1-2 (0.5-4) | f | Peptides (carbohydrates) | O2, S° | 2 | G |
| Thermoplasma volcanium | Pleomorphic | Solfataric fields | 59 | 2 (1-4) | f | Peptides (carbohydrates) | O2, S° | | G |
| Picrophilus torridus | Irregular cocci | Geothermally heated acid soil, Kawayu Onsen, Japan | 60 | 0.7 (0-3.5) | Aerobic | Yeast extract | O2 | 2 | G |
| Ferroplasma acidiphilum | Pleomorphic | Metal-leaching plant | 35 | 1.7 (1.3-2.2) | Aerobic | Fe2+, pyrite | O2, S° | 3 | G, autotrophic |
| Ferroplasma acidarmanus | Pleomorphic | Acid mine drainage, Iron Mountain, SD, USA | 37 | 1.2 (0-2.5) | Aerobic | Fe2+, pyrite, yeast extract | O2, S° | 2 | G, autotrophic, salt and transition-metal tolerant |

ᵃData compiled from Huber and Stetter (156), and from original descriptions from the taxonomy browser (http://www.ncbi.nih.gov). All genera of the Thermoplasmatales are represented.
ᵇG, genome sequence available.
Table 9. Characteristics of thermoacidophilic Sulfolobales (Crenarchaeota)ᵃ

| Species | Source/habitat | Temperature optimum (range) (°C) | pH optimum (range) | Aerobic/anaerobic growth | Metabolism | Substrates | Electron acceptors | No. of species in genus | Remarksᵇ |
|---|---|---|---|---|---|---|---|---|---|
| Sulfolobus acidocaldarius | Solfataric fields, Yellowstone, WY, USA | 70-75 (55-85) | 2-3 (1-6) | Aerobic | Heterotrophic/mixotrophic | Sugars, complex organics, H2 | O2 | 10 | G |
| Sulfolobus solfataricus | Solfataric fields, Solfatara (Naples), Italy | 78-85 (50-88) | 3-4 (2-5.5) | Aerobic | Heterotrophic/mixotrophic | Sugars, starch, complex organics, H2 | O2 | | G |
| Sulfolobus tokodaii | Solfataric fields, Japan | 80 (70-85) | 2.5-3 (2-5) | Aerobic | Heterotrophic/chemolithoautotrophic | Complex organics, amino acids, S° | O2 | | G, sulfur oxidation |
| Sulfolobus shibatae | Solfataric fields, Japan | 81 (-86) | 3 | Aerobic | Heterotrophic/mixotrophic | Complex organics, sugars, H2 | O2 | | Lysogenic host of virus SSV1 |
| Sulfolobus metallicus | Solfataric fields, Iceland | 65 (50-75) | 2-3 (1-4.5) | Aerobic | Obligatory chemolithoautotrophic | S°, sulfidic ores (pyrite, sphalerite, and chalcopyrite) | O2 | | Ore bioleaching |
| Acidianus ambivalens | Solfataric fields, Leirhnukur fissure, Iceland | 80 (-87) | 2.5 (0.8-3.5) | Facultatively anaerobic | Obligatory chemolithoautotrophic | H2, S° | O2, S° | 7 | S°/H2 autotrophy, sulfur oxidation |
| Acidianus infernus | Solfataric fields, Solfatara (Naples), Italy | 85-90 (55-95) | 2 (1-5.5) | Facultatively anaerobic | Obligatory chemolithoautotrophic | H2, S°, sulfidic ores (pyrite, sphalerite, and chalcopyrite) | O2, S° | | S°/H2 autotrophy, sulfur oxidation, Knallgas reaction, ore bioleaching |
| Acidianus brierleyi | Solfataric fields, Yellowstone, WY, USA | 70 (40-75) | 1.5-2 (1-6) | Facultatively anaerobic | Facultatively chemolithoautotrophic | Complex organics, H2, S°, sulfidic ores (pyrite, sphalerite, and chalcopyrite) | O2, S° | | S°/H2 autotrophy, sulfur oxidation, Knallgas reaction, ore bioleaching |
| Metallosphaera sedula | Solfataric fields, Solfatara (Naples), Italy; smoldering slag heaps | 75 (50-80) | 2-3 (1-4.5) | Aerobic | Facultatively chemolithoautotrophic | Complex organics, H2, S°, sulfidic ores (pyrite, sphalerite, and chalcopyrite) | O2 | 3 | Sulfur oxidation, Knallgas reaction, ore bioleaching |
| Stygiolobus azoricus | Solfataric fields, Caldeira Velha, São Miguel, Açores, Portugal | 80 (57-89) | 2.5-3 (1-5.5) | Anaerobic | Obligatory chemolithoautotrophic | H2 | S° | 1 | S°/H2 autotrophy |
| Sulfurisphaera ohwakuensis | Solfataric fields, Japan | 84 (63-92) | 2 (1-5) | Facultatively anaerobic | Heterotrophic/mixotrophic | Complex organics, H2, S° | O2, S° | 1 | |
| Sulfurococcus mirabilis | Smoldering coal refuse piles, solfataric fields | 70-75 | 2-3 | Aerobic | Facultatively chemolithoautotrophic | Complex organics, S°, sulfidic ores (pyrite, sphalerite, and chalcopyrite) | O2 | 2 | |

ᵃData compiled from Huber and Prangishvili (154), and from original descriptions from the taxonomy browser (http://www.ncbi.nih.gov). Cells of all species are irregular cocci, and sometimes lobed cocci. All genera of the Sulfolobales are represented.
ᵇG, genome sequence available.
Elemental sulfur is deposited from H2S by comproportionation
(i.e., the opposite of disproportionation)
with SO2 or by oxidation with air. In seawater, sulfate is
the predominant sulfur species (and the second most
abundant anion), providing the main substrate for sulfate
reducers in submarine vents (see “Hyperthermophilic
sulfate-reducing Euryarchaeota: the Archaeoglobales,”
below) (263). Oxidation and reduction of
ISCs form an important energy source for hyperthermophilic
archaea, but there are also numerous species
that are able to use other inorganic or organic substrates
for energy metabolism (e.g., nitrate and ferric iron;
Tables 3 to 8), and it is likely that most of the
metabolic pathways represented in mesophiles will
also be found in hyperthermophiles.
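
The two routes of sulfur deposition named at the start of this paragraph can be written as balanced equations. The stoichiometries below are the standard textbook forms and are supplied here for illustration; they are not quoted from the cited sources. In the first reaction, sulfur in the -2 state (hydrogen sulfide) and in the +4 state (sulfur dioxide) meet at oxidation state 0, which is what makes it a comproportionation.

```latex
% Comproportionation of hydrogen sulfide with sulfur dioxide
\[
  2\,\mathrm{H_2S} + \mathrm{SO_2} \longrightarrow 3\,\mathrm{S^0} + 2\,\mathrm{H_2O}
\]
% Oxidation of hydrogen sulfide by atmospheric oxygen
\[
  2\,\mathrm{H_2S} + \mathrm{O_2} \longrightarrow 2\,\mathrm{S^0} + 2\,\mathrm{H_2O}
\]
```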


Thermal adaptation 


The large number and variety of (hyper)thermophilic
archaea attest to the ability of cells to adapt
to growth at high temperatures. The fluidity of mem- 
branes is adjusted by varying the proportion of 
tetraether versus diether lipids and/or by the degree of 
cyclization of the lipids (see “Membranes,” above).
The degree of modification in stable RNAs, in par- 
ticular, tRNAs, tends to increase with growth tem- 
perature (283). The histones in Archaea, which com- 
plex and organize DNA, may also protect DNA 
against thermal denaturation (see “Chromosomes, 
DNA structure and replication,” above). Most small 
molecules, such as (d)NTPs, NAD(P)(H), and vita- 
mins, are reasonably stable at high temperatures at 
neutral pH but not at acidic pH (10). The internal pH 
of the thermoacidophile S. acidocaldarius is 6.5, 
whereas it is 4.6 for Picrophilus oshimae. The differ- 
ence in their internal pH suggests that the half-lives of 
thermal denaturation of small molecules may indeed 
limit growth at high temperatures for P. oshimae but 
not for S. acidocaldarius (268) (see “Thermoplas- 
matales,” below). Some hyperthermophiles accumu- 
late small molecules in the cytoplasm, such as cyclic 
2,3-bisphosphoglycerate (Methanopyrus, Methano- 
thermus) or di-myoinositol phosphate (Pyrococcus), 
which may provide thermal protection (359, 364). 

The thermostability, thermolability, and thermoactivity
of enzymes broadly correlate with the growth temperature
of the organism, and the temperature optima for
catalytic activity of thermostable enzymes tend to exceed
the Topt of the host (416). For example, the amylopullulanase
from Thermococcus litoralis is optimally active at
117°C, which is 29°C above the organism’s Topt (45).
The Arrhenius plots of orthologous enzymes from 
mesophiles and thermophiles are typically linear, sug- 
gesting that the enzymes’ functional conformations 
remain unchanged in their respective temperature 


ranges (416). The overall sequence and structural 
similarity between homologous enzymes from differ- 
ent thermal classes that possess identical catalytic 
mechanisms is often very high, making it difficult to 
identify thermostability determinants of comparative 
protein sets. The successful expression of protein- 
encoding genes from hyperthermophiles in mesophilic 
hosts such as E. coli demonstrates that proteins from 
hyperthermophiles tend to be intrinsically ther- 
mostable and fold properly far below the growth tem- 
peratures of their native hosts. This is also true for 
proteins from hyperthermophiles that synthesize high 
concentrations of cyclic 2,3-bisphosphoglycerate or 
di-myoinositol phosphate (359, 364). In general, ther- 
mostability appears to arise from a few highly specific 
alterations within individual proteins. Stability 
against melting seems to arise from changes in protein 
rigidity, which is affected by a combination of ion 
pairs and ion-pair networks, hydrogen bonding, and 
hydrophobic and van der Waals interactions. In addition,
intersubunit interactions, shortened or more
rigidly anchored loops, stabilization of the N- and 
C-terminal ends of proteins, and less solvent-exposed 
hydrophobic interactions can all lead to a decrease 
in the ability of protein domains to move and there- 
fore contribute to the protein’s stability (416, 427). 
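
The statement that Arrhenius plots are linear can be made concrete with a short numerical sketch. It assumes the standard Arrhenius relation k = A·exp(-Ea/RT), so that ln k plotted against 1/T is a straight line of slope -Ea/R. The activation energy, prefactor, and temperature series below are invented for illustration and do not describe any enzyme in the chapter.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_rate(temp_c, ea_j_mol, prefactor):
    """Rate constant from the Arrhenius relation k = A * exp(-Ea / (R * T))."""
    temp_k = temp_c + 273.15
    return prefactor * math.exp(-ea_j_mol / (R * temp_k))


def fit_arrhenius(temps_c, rates):
    """Least-squares line through (1/T, ln k).

    Returns (activation energy in J/mol, ln A, r_squared).
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope * R, intercept, r_squared


# Hypothetical enzyme: Ea = 60 kJ/mol, A = 1e9 s^-1, assayed from 60 to 100 °C.
temps = [60, 70, 80, 90, 100]
rates = [arrhenius_rate(t, 60_000.0, 1e9) for t in temps]
ea, ln_a, r2 = fit_arrhenius(temps, rates)
print(f"fitted Ea = {ea / 1000:.1f} kJ/mol, r^2 = {r2:.6f}")
```

With measured data, a clear break or curvature in the plot (an r² well below 1, or different slopes in different temperature windows) would be the signal that the functional conformation changes within the assayed range.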

A large amount of TF55 chaperonin-like protein 
is synthesized by Pyrodictium brockii when it is 
grown at 108°C, suggesting that chaperonins in gen- 
eral may be important under conditions close to the 
maximum growth temperature (307) (see Chapter 
10). Only one gene that is common to all hyperther- 
mophiles—encoding a reverse gyrase—has been iden- 
tified in genomewide comparisons (see Chapter 19). 
The enzyme introduces positive supercoils into DNA
via an ATP-dependent topoisomerase I activity, and 
it was thought to play a major role in thermostabi- 
lization of DNA (94, 194). However, a reverse gyrase 
gene knockout in Thermococcus kodakaraensis does 
not change its phenotype or growth kinetics (11). Al- 
though this does not rule out that reverse gyrase can 
be part of a DNA thermoadaptation mechanism, it 
shows that it is not essential for this organism and 
further highlights the difficulties in attempting to de- 
fine general rules for microbial adaptation to high 
temperatures. 


Heterotrophic Hyperthermophilic Euryarchaeota: 
The Thermococcales 


Heterotrophic Thermococcales and especially 
Thermococcus spp. are easily enriched and isolated. 
They all grow heterotrophically under anaerobic con- 
ditions, usually by fermentation of carbohydrates 
and/or peptides. Elemental sulfur is either essential 
or stimulatory for growth (Table 3). Three genera
have been described, Pyrococcus (six species), Ther- 
mococcus (30 species), and Palaeococcus (two species) 
(28). Thermococcales species are abundant organisms 
in marine environments, shallow and deep-sea hy- 
drothermal vents, and continental and offshore oil 
wells, whereas only a minority has been isolated from 
terrestrial samples (e.g., 331). In general, the cells are 
regular or irregular cocci, 1 to 2.5 µm in diameter,
have an S layer with hexagonal symmetry, and are fla- 
gellated (Fig. 16) (28). 

Two of the three genera of the Thermococcales 
have been well defined by their 16S rDNA phylogeny 
and other taxonomic features. The genus Pyrococcus 
was originally used to describe marine organisms 
with a low G+C content (38 to 48%) growing opti- 
mally at =100°C. Pyrococcus species form a coherent 
phylogenetic group that clusters with the chitin-degrading
T. chitinophilus (28). The two Palaeococcus
species are characterized by growth at =90°C, pH
4 to 8, and moderate salinity, and branch more deeply
in the order Thermococcales (6, 28, 400). P. ferrophilus
is barophilic (Popt ~30 MPa). The two species
are distinguished by their G+C content (P. ferrophilus,
53%; P. helgesonii, 42%). The genus
Thermococcus encompasses at least 30 species that 
differ in their physiology and are phylogenetically 
separated into several clades that include isolates with 
a broad range of G+C content (40 to 58%). The two 
species T. alcaliphilus and T. acidoaminovorans are 
thermoalkaliphiles, while others are mostly neutrophilic.
For the purposes of this chapter, neutrophiles,
acidophiles, and alkaliphiles are defined as
microorganisms that grow optimally in the pH ranges
5 to 8, 0.7 to 4, and 9 to 11, respectively.


Figure 16. Electron micrograph of Thermococcus celer. Repro- 
duced from Bergey’s Manual of Systematic Bacteriology (208) with 
permission of the publisher. 


Thermococcales grow quickly, with doubling 
times between 25 and 70 min. Many species are suf- 
ficiently aerotolerant to be handled without requiring 
strict anaerobic conditions. They can be easily grown 
in anaerobic jars on solid media (Gelrite or Phytagel). 
Thermococcales are the most common source of com- 
mercially available proofreading DNA polymerases 
for PCR applications (254) (see Chapter 22). Many 
other enzymes of potential interest have also been de- 
scribed from Thermococcales, including proteases, es- 
terases, and ATP-dependent DNA ligases (78, 206, 
258). Genome sequences have been determined and published
for P. furiosus, P. abyssi, P. horikoshii, and
T. kodakaraensis (http://www.ncbi.nih.gov), whereas the
genome sequence of T. gammatolerans (France) has
not yet been published (see 41, 42) (see Chapter 19).
Pyrococcus furiosus has been selected as a model 
organism for a comprehensive transcriptomic/pro- 
teomic/structural genomic project, and as of January 
2006, 30 X-ray structures were available in the Pro- 
tein Data Bank, with many more in progress 
(http://www.secsg.org) (see Chapter 20). An in vitro 
transcription system has been developed for P. furio- 
sus, and this has been instrumental in studying the 
features of archaeal transcription that are similar to
those of the Eucarya (see Chapter 6).

T. kodakaraensis is a promising archaeal model 
organism for future research, although few publica- 
tions are available to date. A versatile gene knockout 
system based on auxotrophy markers has been devel- 
oped. T. kodakaraensis is naturally competent and ef- 
ficient in recombination, and this allows specific genes 
to be deleted without producing polar effects (346, 
347). Single, double, and triple mutations of various 
biosynthesis genes (ΔpyrF, ΔtrpE, ΔhisD) have been
constructed. The deletion of the reverse gyrase gene 
was performed with this system, demonstrating that 
it is not essential for growth at high temperatures (11) 
(see “Thermal adaptation,” above). 

A remarkable feature of several of the Thermococcales
is their high resistance to ionizing (γ) radiation.
At least three Thermococcus isolates have been
obtained by applying a dose of 30 kGy from a 60Co
source in enrichment cultures (171, 172). The level
of radiation resistance is of the same order of magnitude
as for the bacterium Deinococcus radiodurans, which
exhibits a D10 survival dose (10% colony-forming
units after irradiation) of ~16 kGy (e.g., compared
with a D10 dose for E. coli of ~0.7 kGy) (290).
Stationary-phase P. abyssi cultures are able to completely
repair their fully fragmented chromosomes (double-strand
breaks) within 2 h of irradiation (173). P. abyssi
seems to respond to DNA damage by uncoupling
DNA repair from DNA synthesis and thus prevents
the accumulation of genetic errors by exporting damaged
DNA. Several typical repair proteins, including
RadA, replication protein A, and replication factor
C, remain chromatin bound before and after irradiation,
and active chromatin-bound repair and replication
complexes are therefore thought to remain available
to counteract DNA damage (173).
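
To give a feel for what the D10 values quoted above imply, the sketch below converts a dose into a surviving fraction. It assumes a purely log-linear survival curve, S = 10^(-dose/D10), which is a simplification introduced here and not a model taken from the cited studies; real survival curves may deviate from it, for example with a shoulder at low doses. The function names are hypothetical.

```python
def log10_reductions(dose_kgy, d10_kgy):
    """Number of tenfold reductions in viable count at a given dose.

    Assumes log-linear survival (an assumption of this sketch).
    """
    return dose_kgy / d10_kgy


def surviving_fraction(dose_kgy, d10_kgy):
    """Fraction of colony-forming units expected to survive the dose."""
    return 10 ** (-log10_reductions(dose_kgy, d10_kgy))


# D10 values as quoted in the text (kGy).
D10_RADIORESISTANT = 16.0
D10_E_COLI = 0.7

for dose in (0.7, 16.0, 30.0):
    resistant = surviving_fraction(dose, D10_RADIORESISTANT)
    sensitive = surviving_fraction(dose, D10_E_COLI)
    print(f"{dose:5.1f} kGy: D10=16 kGy -> {resistant:.3e}, "
          f"D10=0.7 kGy -> {sensitive:.3e}")
```

Under this simplified model, a 30-kGy enrichment dose corresponds to fewer than two tenfold reductions for an organism with a D10 of 16 kGy but more than forty for one with a D10 of 0.7 kGy, which is why such a dose acts as a strong selection for radioresistant isolates.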

The three Pyrococcus genome sequences have 
been used for the analysis of genome evolution and 
for determining their origins of replication (273, 458) 
(see Chapter 19). P. horikoshii and P. abyssi have very 
similar genomes (Fig. 17). It was concluded from the 
syntenic fragments that one reversion and one trans- 



position event is sufficient to explain the differences 
in the P. abyssi genome after an inferred recent diver- 
gence from P. horikoshii (458). In P. horikoshii and 
P. abyssi, most genes are predicted to be transcribed 
in the same direction as DNA replication within each 
of the two replichores (two chromosome halves de- 
fined by replication origin and terminus). In contrast, 
the P. furiosus genome appears to be more scrambled 
and only short regions of synteny exist with the other 
two species. P. furiosus contains numerous recom- 
bined segments, many of which correlate with bound- 
aries of insertion elements, suggesting that transposition 
events have caused many disruptions in the 
genome. In P. furiosus, colinearity of genome transcription 
and replication only occurs for highly transcribed 
genes. 

Figure 17. Genome rearrangements in Pyrococcus species. (A, B) Pairwise genome dot plots: (A) solid arrows with capital letters 
indicate large synteny segments conserved between P. horikoshii and P. abyssi; (B) shaded arrows represent the location and 
relative orientation of synteny segments conserved in all three species; small squares denote position of IS elements in the 
P. furiosus genome. (C to E) Genome organization and synteny segments in the three Pyrococcus species. Location and orientation 
of the major syntenic genome fragments conserved between P. horikoshii and P. abyssi are marked as above; 10 smaller 
syntenic fragments conserved also with P. furiosus are marked with medium arrows. Dashed arrows represent the two replichores 
defined by the origin of replication (oriC) and the putative terminus of replication at the bottom of the arrows; LCTR1 
and 2, two families of long conserved tandem repeats occurring in all species (458). Reproduced with modifications from Nucleic 
Acids Research (458) with permission of the publisher. 
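
The colinearity of transcription and replication described for the Pyrococcus genomes can be quantified as the fraction of genes whose strand matches the direction of replication-fork movement in their replichore. The following is a minimal sketch, assuming a circular chromosome with a single origin and terminus and bidirectional replication. The coordinates, the toy gene list and the function names are hypothetical and are not taken from the cited analyses (458).

```python
from typing import Iterable, Tuple

Gene = Tuple[int, int]  # (start coordinate, strand), strand is +1 or -1


def on_clockwise_replichore(pos: int, ori: int, ter: int, length: int) -> bool:
    """True if pos lies on the arc running from ori to ter in the
    direction of increasing coordinates (modulo the genome length)."""
    return (pos - ori) % length < (ter - ori) % length


def is_co_oriented(gene: Gene, ori: int, ter: int, length: int) -> bool:
    """True if the gene is transcribed in the direction of fork movement."""
    pos, strand = gene
    expected = 1 if on_clockwise_replichore(pos, ori, ter, length) else -1
    return strand == expected


def co_oriented_fraction(genes: Iterable[Gene], ori: int, ter: int, length: int) -> float:
    """Fraction of genes co-oriented with replication."""
    gene_list = list(genes)
    if not gene_list:
        raise ValueError("no genes supplied")
    hits = sum(is_co_oriented(g, ori, ter, length) for g in gene_list)
    return hits / len(gene_list)


# Hypothetical toy genome of 1,000 positions with oriC at 0 and terminus at 500.
toy_genes = [(100, 1), (300, 1), (450, -1), (600, -1), (800, -1), (900, 1)]

if __name__ == "__main__":
    print(f"{co_oriented_fraction(toy_genes, ori=0, ter=500, length=1000):.2f}")
```

Applied separately to all genes and to a highly transcribed subset, such a measure would distinguish the pattern described for P. horikoshii and P. abyssi (most genes co-oriented) from that described for P. furiosus (co-orientation mainly among highly transcribed genes).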


Metalloproteins, sugar, sulfur, and hydrogen 
metabolism in P. furiosus 


The reduction of S° or protons, coupled to the oxidation 
of carbohydrates and peptides to CO2 and small 
organic compounds (e.g., acetate or alanine), is one of 
the most common pathways of energy metabolism in 
hyperthermophilic archaea, in particular, for members 
of the Thermococcales (7, 385) (see Chapters 12 and 
20). P. furiosus has served as a model archaeon for 
studying glucose, maltose, and amino acid metabolism 
and sulfur reduction, although several important as- 
pects remain to be determined. Peptides are the opti- 
mal substrate for fermentation in P. furiosus, and S° is 
required, a property which is also true for other Pyro- 
coccus species (3). In contrast, lower growth rates are 
observed during fermentation of maltose. If S° is included 
in maltose growth medium, however, significant 
H2S formation occurs even though S° is neither required for 
growth nor increases the growth rate. 

P. furiosus catabolizes sugars using a modified 
Embden-Meyerhof-Parnas glycolytic pathway (EMP). 
Subsequently, pyruvate:ferredoxin oxidoreductase converts 
pyruvate into acetyl-CoA and CO2. The pathway 
involves most steps known from bacterial glycolysis 
but utilizes a few unusual enzymes. The modified EMP 
in Pyrococcus includes the ADP-dependent hexokinase 
and phosphofructokinase instead of ATP-dependent 
enzymes (189, 190). The most peculiar step in the gly- 
colytic pathway of many hyperthermophilic archaea 
is the single-step oxidation of glyceraldehyde 3-phos- 
phate (GAP) to 3-phosphoglycerate. The reaction is 
catalyzed by a tungsten-containing GAP:ferredoxin 
oxidoreductase that is related to other aldehyde:ferre- 
doxin oxidoreductases (AOR, Fig. 18) instead of an 
NAD-dependent GAP dehydrogenase (334). 

Three different soluble proteins with S°-reducing 
activity have been purified from P. furiosus. Two of 
these enzymes are hydrogenases with additional S° 
or polysulfide reductase activity using H2 or NADPH 
as electron donors (sulfhydrogenases) (248). The 
third enzyme is a ferredoxin:NADP oxidoreductase 
with a broad substrate range, which oxidizes and/or 
reduces many different substrates, including FAD, 
polysulfides, NAD(P)(H), and O2 (249). A membrane- 
bound hydrogenase (MBH) purified from P. furiosus 
did not reduce S° or polysulfides (343). MBH activi- 
ties increased in the presence of maltose and in the ab- 
sence of S° (3). The enzyme was part of a membrane- 
bound complex, which appears to couple maltose 
fermentation to proton translocation across the mem- 
brane, driven by proton reduction with ferredoxin as 
electron donor. It constitutes a novel and very simple 
anaerobic electron transport chain. Identifying this 
protein complex helped to resolve questions concern- 
ing energy conservation in Pyrococcus cells growing 
in the absence of S° (342). However, it has not been 
determined whether S° reduction is coupled to elec- 
tron transport (3, 342). Inducible transcripts (mi- 
croarrays) and proteins (2D gels) were identified in 
sulfur-grown, compared with sulfur-free, cultures. 
Two genes, sipA and sipB, which are divergently tran- 
scribed from a common promoter region, had the 
highest increase in mRNA levels (363). The gene 
products are membrane-associated proteins with un- 
known function (3, 342). It remains to be determined 
whether the so far elusive energy-converting sulfur re- 
ductase will be found in P. furiosus. 

Related to its capacity for sulfur metabolism, 
Pyrococcus has been an important source for metal- 
loproteins (in addition to some methanogens and Sul- 
folobus), which have served as models for under- 
standing redox processes in the Archaea, Bacteria, 
and Eucarya. Important findings from research into 
Pyrococcus and Thermococcus metalloproteins in- 
clude the structure determination of a family of 
tungsten-containing AORs (Fig. 18) (334), charac- 
terization of the superoxide reductase/rubrerythrin 
oxygen protection mechanism (1, 440), and analy- 
sis of membrane-bound, proton-translocating, and 
proton-reducing hydrogenases (140, 342). These 
achievements provided an important foundation for 
the Pyrococcus structural genomics project, part of 
which specifically focuses on the metalloproteome 
(http://www.secsg.org). 


Hyperthermophilic 
Sulfate-Reducing Euryarchaeota: 
The Archaeoglobales 


Archaeoglobus fulgidus was the first archaeon to 
be isolated that grows by dissimilatory sulfate reduc- 
tion (2). Archaeoglobus strains are often the first to 
grow when enriching sulfate reducers from samples 
of natural or man-made sulfate-rich biotopes. The 
number of archaea with this growth property on sul- 
fate is limited, with the only other cultivatable isolate 
being Caldivirga (Thermoproteales, Crenarchaeota). 
A. fulgidus uses organic acids such as lactate and/or 
H2 as electron donors to reduce sulfate, thiosulfate, 
or sulfite to H2S. In contrast, the phylogenetically related 
archaeon, Ferroglobus placidus, mainly grows 
autotrophically using nitrate or thiosulfate as electron 
acceptors and Fe²⁺, H2, and H2S as electron donors 
(Table 4) (129). Ferroglobus is one of the few archaea 
reported to oxidize acetate anaerobically; others include 
the phylogenetically related Geoglobus and 
many methanogens (180, 403). Ferroglobus is also 
the only archaeon reported to oxidize aromatic hydrocarbons 
anaerobically (404). Archaeoglobus and 
relatives have been repeatedly isolated from oil fields, 
where they grow at the oil/water interface, and are 
partially responsible for well souring and H2S production. 
A. fulgidus was one of the first members of 
the Archaea to have its genome sequenced (201), revealing 
a large number of genes and operon-like 
structures potentially involved in the degradation of 
numerous biopolymers and small compounds, including 
the β-oxidation of fatty acids. 

Figure 18. (See the separate color insert for the color version of this illustration.) Three-dimensional structures of tungsten-containing 
aldehyde:ferredoxin oxidoreductases from Pyrococcus furiosus. (A) Cartoon of the formaldehyde:ferredoxin oxidoreductase 
(FOR), homotetrameric holoenzyme (150). (B) Cartoon of the aldehyde:ferredoxin oxidoreductase (AOR) homodimeric 
holoenzyme (53). (C) Peptide chains of AOR (cyan) superimposed on FOR (magenta) showing close structural 
similarity (150). (D) Active-site cavity of the FOR with surrounding residues and glutarate shown (150). (E) [4Fe-4S] cluster 
and the W-(bis-tungstopterin) cofactor of the AOR (53). FOR images reproduced from the Journal of Molecular Biology with 
permission of the publisher (150); AOR images reproduced from Science (53) with permission of the publisher. 

The biochemistry of dissimilatory sulfate reduction 
is essentially the same as in the assimilatory pathway 
(although 3′-phosphoadenylylsulfate has not yet been 
found in Archaea). Three enzymes are required (equations 
1 to 3; see Fig. 25 below). They are localized in 
the cytoplasm, and all have been purified from A. 
fulgidus (62, 376). 


ATP:sulfate adenylyltransferases 
ATP + SO₄²⁻ ⇌ APS + PPᵢ (1) 


Adenylylsulfate (APS) reductase 
APS + 2 e⁻ + H⁺ ⇌ AMP + HSO₃⁻ (2) 


Sulfite reductase 
HSO₃⁻ + 6 e⁻ + 6 H⁺ ⇌ HS⁻ + 3 H₂O (3) 
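
Summing the three reactions shows the overall stoichiometry of the pathway, because the intermediates APS and bisulfite cancel. The following is a minimal bookkeeping sketch, assuming the three reactions exactly as written above; the species labels and the helper function are illustrative choices and not part of the cited work.

```python
from collections import Counter

# Each reaction is (reactants, products); values are stoichiometric coefficients.
REACTIONS = [
    ({"ATP": 1, "SO4^2-": 1}, {"APS": 1, "PPi": 1}),              # adenylyltransferase
    ({"APS": 1, "e-": 2, "H+": 1}, {"AMP": 1, "HSO3-": 1}),        # APS reductase
    ({"HSO3-": 1, "e-": 6, "H+": 6}, {"HS-": 1, "H2O": 3}),        # sulfite reductase
]


def net_reaction(reactions):
    """Add reactions and cancel species that appear on both sides.

    Returns two dicts: net species consumed and net species produced.
    """
    balance = Counter()
    for reactants, products in reactions:
        for species, n in reactants.items():
            balance[species] -= n
        for species, n in products.items():
            balance[species] += n
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced


if __name__ == "__main__":
    consumed, produced = net_reaction(REACTIONS)
    print("consumed:", consumed)
    print("produced:", produced)
```

The net result is that one sulfate is reduced to sulfide at the cost of eight electrons, seven protons and one ATP (converted to AMP and pyrophosphate).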


ATP:sulfate adenylyltransferases (equation 1) 
are key enzymes in sulfate reduction. The homooligo- 
meric enzymes are found in all organisms capable of 
dissimilatory sulfate reduction and consist of 41 to 
69 kDa subunits (376). The distinction between dif- 
ferent types of the enzyme does not correlate with the 
organisms’ phylogeny but rather with growth temper- 
ature. The enzymes from thermophiles, but not from 
mesophiles, contain a zinc-binding site, which may con- 
tribute to the thermal stabilization of the protein (396). 

The crystal structure of the A. fulgidus dissimi- 
latory APS reductase (equation 2) serves as a model 
for this class of enzymes. Structural and spectroscopic 
studies facilitated the determination of the reaction 
mechanism (103, 104). APS reductases are hetero- 
dimeric, FAD-containing, and iron-sulfur (FeS) clus- 
ter-containing enzymes (62). The flavoprotein subunit 
is paralogous to the large subunit of succinate dehy- 
drogenases (SDH). The smaller subunit contains two 
FeS clusters in the APS reductase and does not have 
paralogs or orthologs in other enzymes. The flavin is 
located at the bottom of a pocket of the large subunit 
(104) in proximity to the first [4Fe-4S] cluster. The 
appearance of a broad radical signal in EPR spectra 
following incubation with AMP and sulfite indicates 
that flavosemiquinone may interact with cluster I. 
These results suggest that an FAD N(5)-sulfite adduct 
is the intermediate during both the oxidative and the 
reductive reactions. 

The dissimilatory, siroheme-containing sulfite re- 
ductases (DSRs; equation 3) catalyze the six-electron 
reduction from sulfite to sulfide, and vice versa. They 
are present in all anaerobic sulfate or sulfite reducers 
and in some sulfide-oxidizing bacteria, and they have 
served as specific, amplifiable gene markers in molec- 
ular ecology studies (178). The heterodimeric en- 
zymes may have arisen from a gene duplication event. 
The dsrA and dsrB gene products have a moderate 
level of similarity (~30% amino acid identity be- 
tween A. fulgidus DsrA and DsrB). Both subunits are 
strictly conserved between the Archaea and Bacteria 
(A. fulgidus [archaeon] and Desulfovibrio vulgaris 
[bacterium] have ~60% identity) (62). The siroheme 
in the DSRs is a complex cofactor consisting of a heme 
molecule electrically coupled to a [4Fe-4S] cluster. 
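
The identity values quoted above (~30% between DsrA and DsrB, ~60% between archaeal and bacterial subunits) are derived from pairwise sequence alignments. The following is a minimal sketch of one common way to compute percent identity from two already aligned sequences, assuming that gap-containing columns are excluded from the count; other conventions for the denominator exist. The toy peptide strings are invented for illustration and are not Dsr sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy alignment: 6 ungapped columns, 5 of them identical.
    print(f"{percent_identity('MKT-AYLG', 'MRTQAY-G'):.1f}%")
```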

A DSR with an unusual domain composition, co- 
factor specificity, and function (sulfite detoxification) 
was isolated from M. jannaschii (170). M. jannaschii 
and Methanothermobacter thermautotrophicus grew 
with normally toxic levels of sulfite, utilizing it as the 
sole source of sulfur. A protein with an N-terminal 
F420 dehydrogenase and a C-terminal DsrA domain 
was identified that had F420-dependent sulfite reductase 
activity and contained siroheme. Homologs were 
found exclusively in the genome sequences of other 
strictly hydrogenotrophic and sulfite-resistant ther- 
mophilic methanogens. This type of sulfite resistance 
may provide an evolutionary selective advantage. 

The Archaeoglobales are phylogenetically similar 
to methanogenic archaea and use several of the coen- 
zymes of methanogenesis to oxidize organic sub- 
strates (141). For example, lactate is converted via 
pyruvate to acetyl-CoA and CO2, coupled to the 
reduction of ferredoxins. An acetylsynthase/carbon 
monoxide dehydrogenase complex, similar to that 
found in methanogens, cleaves the C-C bond in 
acetyl-CoA, generating both an enzyme-bound methyl 
group and bound CO. CO and the methyl group are 
oxidized to CO2 by reactions with coenzymes that are 
characteristic of the methanogens (see Chapter 13). 
However, it is not clear how electrons are transferred 
to the enzymes of the sulfate reduction pathway. 


Hyperthermophilic and Predominantly 
Neutrophilic Crenarchaeota: 
Thermoproteales and Desulfurococcales 


Cultivated crenarchaeota have mainly been iso- 
lated from volcanically heated terrestrial habitats, 
although some have been isolated from marine envi- 
ronments. A hallmark of the hyperthermophilic Cre- 
narchaeota is chemolithoautotrophic growth, which 
makes them important primary producers in these 
environments. Many isolates require elemental sul- 
fur, prompting their description as “sulfur-dependent 
and thermophilic Archaebacteria.” However, several 
heterotrophs and autotrophs have been identified 
that can grow at lower temperatures, indicating that 
this branch of the Crenarchaeota represents diverse 
physiologies. 

The thermophilic/hyperthermophilic Crenarchae- 
ota are divided into three orders: Thermoproteales, 
Desulfurococcales, and Sulfolobales. The thermoaci- 
dophilic Sulfolobales form a phylogenetically coher- 
ent group that is distinct from the Thermoproteales 
and Desulfurococcales (see “Sulfolobus, Acidianus, 
and relatives,” below). In contrast, the description of 
the Thermoproteales and Desulfurococcales is not 
clear. The Pyrodictiaceae have been described as a 
separate order, “Pyrodictiales,” that is distinct from 
the Desulfurococcales (e.g., 386). However, more re- 
cent descriptions group the genera Pyrodictium, Hy- 
perthermus, and Pyrolobus into one family within the 
Desulfurococcales (155). 

Most of the Desulfurococcales and Thermopro- 
teales are anaerobes, although both taxa include aer- 
obes, such as Aeropyrum and Pyrobaculum (Table 5). 
Anaerobic chemolithoautotrophic growth with H2 as 
electron donor and S° as electron acceptor (“S°/H2 autotrophy”) 
(92) is characteristic of many hyperthermophilic and thermophilic 
members of the Crenarchaeota and is probably 
the most important energy-yielding reaction for 
CO2 fixation in volcanic hot springs and submarine 
vents. Other electron acceptors include thiosulfate, 
nitrate, nitrite, ferric iron, and sulfate. Many species 
grow with organic substrates, either by fermentation 
(e.g., Hyperthermus) or by anaerobic respiration cou- 
pled to inorganic electron acceptors. Organic nutrients 
are often oxidized completely to CO2, even under 
anaerobic conditions (e.g., Thermoproteus). 


Desulfurococcales 


All Desulfurococcales are hyperthermophiles 
and include all isolates over the past 25 years that are 
able to grow at the highest temperatures: Pyrodic- 
tium, Pyrolobus, and strain 121. Most strains grow 
anaerobically, and a few grow aerobically or faculta- 
tively anaerobically. Energy is gained from the oxi- 
dation of hydrogen under autotrophic conditions us- 
ing elemental sulfur, thiosulfate, nitrate, or nitrite as 
electron acceptors, and CO2 as a carbon source. Al- 
ternatively, organotrophic growth can occur by aero- 
bic or anaerobic respiration, or by fermentation of or- 
ganic substrates. The culturable Desulfurococcales 
isolates are divided into two families: the Desulfuro- 
coccaceae include a large number of genera and di- 
verse physiological types, while the Pyrodictiaceae 
contain only three genera (155). Desulfurococcales 
cells are all coccoid or disc shaped and occur singly or 
in aggregates. The diameter of cells is typically 0.5 to 
4 µm. 


Desulfurococcaceae. The Desulfurococcus genus 
includes five formally described species and several 
incompletely described strains that were isolated from 
terrestrial volcanic samples (155). They grow anaero- 
bically with organic substrates (peptides and/or car- 
bohydrates), either by fermentation or sulfur respira- 
tion (Table 5) (155). Similar conditions are required 
for Sulfophobococcus zilligii and Thermosphaera 
aggregans, which grow anaerobically by fermentation 
of complex organic nutrients, when sulfur is absent 
(Table 5). Thermosphaera aggregans was one of the 
first microorganisms to be predicted by environmental 
16S rDNA analysis to exist, prior to being isolated 
from the same environment (158). Acidilobus aceticus 
(pHopt 3.8) and the related Caldococcus noboribetus 
(pHopt 3.0) require complex nutrients for growth, and 
growth is stimulated by the addition of sulfur (155). 
Acetate is one of the main products of fermentation. 
The two isolates cluster together phylogenetically and 
represent a rare example of thermoacidophilic anaer- 
obic crenarchaeota, which are not members of the Sul- 
folobales order (Tables 5 and 8). 

One of the first archaeal introns was identified in 
the 23S rRNA gene of Desulfurococcus mobilis (230) 
(see Chapter 7). After excision, the intron ligates 
to form a stable circular RNA that encodes a site- 
specific endonuclease. Cleavage and exon-splicing re- 
actions resembled those of type II tRNA introns in 
Eucarya (199, 428). 

The marine Desulfurococcaceae comprise the 
genera Aeropyrum, Ignicoccus, Staphylothermus, 
Stetteria, and Thermodiscus. Staphylothermus, 
Thermodiscus, and Stetteria require complex nu- 
trients in addition to S° (155). Staphylothermus de- 
grades proteins and peptides, and possesses a char- 
acteristic S-layer structure with long stalks and 
4-fold symmetry (Fig. 5) (see Chapter 14). Stetteria 
grows mixotrophically with organic nutrients, H2, 
and S° or thiosulfate. 

The two Aeropyrum species differ from other 
members of the Desulfurococcales by being strictly 
aerobic and growing on complex organic nutrients. 
Thiosulfate is likely to be oxidized to sulfate and 
stimulates the growth of Aeropyrum pernix. The 
availability of the genome sequence of A. pernix (the 
first published for a member of the Crenarchaeota) 
(182) underpinned efforts to determine 3D protein 
structures. One of the most interesting outcomes was 
the resolution of a voltage-gated potassium channel 
(KvAP) and the determination of the channel’s opening 
mechanism in response to membrane polarization 
(Fig. 19). The channel contains a K⁺ pore that is sur- 
rounded by “sensors” (consisting of helices) near the 
pore, and “voltage-sensor paddles” on the outer 
perimeter of the pore. In the crystal structure of the 
closed channel, the paddles are located near the in- 
tracellular side of the channel. In response to mem- 
brane depolarization, the paddles appear to move 
across the membrane toward the outside and open 
the channel (168, 169, 234). 

Figure 19. (See the separate color insert for the color version of this illustration.) Model of the Aeropyrum voltage-gated 
K⁺ channel KvAP and comparison with the Streptomyces lividans KcsA K⁺ channel. (A) Stereo view of the KvAP pore with 
electron density map contoured at 1.0 σ; carbon (yellow), nitrogen (blue), oxygen (red), potassium (green). (B, C) α-Carbon 
traces of the KvAP pore (blue) and the Streptomyces lividans KcsA K⁺ channel (green) shown as a side view (B) and end-on 
from the intracellular side (C); S5, S6, outer and inner helices; glycine-gating hinges (red spheres). (D, E) Models of the closed 
(D) and open (E) KvAP structures based on the positions of the paddles (red), the pore and the S5 and S6 helices of KcsA. Reproduced 
with modifications from Nature (168, 169) with permission of the publisher. 

In contrast to the members of Desulfurococcaceae 
described above, the three Ignicoccus species 
grow by S°/H2 autotrophy. I. islandicus and I. pacificus 
are formally described (Table 5), whereas Ignicoccus 
sp. KIN4/I is not. Strain KIN4/I may be related 
to I. islandicus as it was sourced from the same geo- 
graphical region. Ignicoccus is the only member of 
the Archaea that possesses a true periplasmic space 
formed by an outer membrane, rather than an S layer 
(Fig. 20) (see Chapter 14). The periplasm is much 
wider than in gram-negative bacteria (20 to 400 nm) 
and contains membrane-bound vesicles. The outer 
membrane contains lipids and regularly arranged 
proteins, in addition to pores (24 nm diameter) that 
are surrounded by a ring of regularly arranged parti- 
cles (130 nm diameter). The function of all these 
membrane features is unknown (275, 315). 
Unusually small symbiotic (or parasitic) cells of 
N. equitans were discovered attached to the outer 
membrane of Ignicoccus sp. KIN4/I (Fig. 20) (152). 
Whereas the Ignicoccus strain grew successfully in 
monoculture, it has not been possible to grow N. eq- 
uitans without the host. The circular genome of 
N. equitans (481 kbp) encodes 585 genes and is the 
smallest known genome for a free-living cell (429). 
Seventy-three percent of the genes have homologs in 
GenBank. In contrast, Acanthamoeba polyphaga 
mimivirus has the largest viral genome (linear, 1.19 
Mbp) and encodes 911 genes, for which ~10% have 
homologs in GenBank (110). The Nanoarchaeum 
16S rDNA sequence is significantly different from 
other archaea, and the genus has been placed into a 
novel kingdom, the Nanoarchaeota. The phylogenetic 
placement is unclear (see “Phylogeny of Archaea and 
the origin of life,” above), and it has been argued that 
it might have evolved rapidly and be derived from the 
Euryarchaeota (see Chapter 19). Because of the divergence 
of its 16S rDNA, N. equitans could not be detected 
in environmental samples with “universal” archaeal 
16S primers, but it can now be detected using specific primers. 
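
The genome figures quoted above for N. equitans and the mimivirus can be compared directly as gene densities and as approximate numbers of genes with database homologs. The following is a small arithmetic sketch that uses only the values as quoted in the text (genome size, gene count and fraction of genes with GenBank homologs); the dictionary layout and function names are illustrative assumptions.

```python
# Values as quoted in the text; sizes in kbp.
GENOMES = {
    "Nanoarchaeum equitans": {"size_kbp": 481.0, "genes": 585, "homolog_fraction": 0.73},
    "Acanthamoeba polyphaga mimivirus": {"size_kbp": 1190.0, "genes": 911, "homolog_fraction": 0.10},
}


def genes_per_kbp(size_kbp: float, genes: int) -> float:
    """Average number of genes per kilobase pair of genome."""
    return genes / size_kbp


def genes_with_homologs(genes: int, homolog_fraction: float) -> int:
    """Approximate number of genes with database homologs (rounded)."""
    return round(genes * homolog_fraction)


if __name__ == "__main__":
    for name, g in GENOMES.items():
        density = genes_per_kbp(g["size_kbp"], g["genes"])
        known = genes_with_homologs(g["genes"], g["homolog_fraction"])
        print(f"{name}: {density:.2f} genes/kbp, ~{known} genes with homologs")
```

On these figures the archaeal genome is less than half the size of the viral genome but is more densely packed with genes, and a far larger share of its genes have recognizable homologs.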

The genome of N. equitans has revealed a number 
of interesting evolutionary features. For example, sev- 
eral of the tRNA genes are split and the two gene seg- 
ments are physically separated in the genome (see 
Chapter 7) contributing to the debate concerning 
whether modern tRNAs evolved from split precursors 
(317). It is not clear what benefits the two species, Ig- 
nicoccus and Nanoarchaeum, derive from their asso- 
ciation. For example, it is not known what metabolites 
are transferred between them. The lipid composition 
in the membranes of both species changes in parallel 
when growth temperature is changed, suggesting that 
Nanoarchaeum may utilize the lipid pool from the Ig- 
nicoccus host (164). The discovery of Nanoar- 
chaeum/Ignicoccus has promoted numerous avenues 
for new research into the Archaea. 


Figure 20. (See the separate color insert for the color version of this illustration.) Electron micrograph and fluorescence images 
of Ignicoccus and Nanoarchaeum. (A) Transmission electron micrograph of thin-sectioned Ignicoccus cell with broad 
periplasmic space (P) and budded vesicles; OM, outer membrane, C, cytoplasm, bar, 1 µm. (B) Negative stained Ignicoccus 
outer membrane, highlighting power spectra of image field (C to E) (275). Panels A to E reproduced from Biochemical Society 
Symposia (275) with permission of the publisher. (F) Ultrathin section of Nanoarchaeum cells attached to the outer membrane 
of Ignicoccus sp. KIN4/I. (G) Platinum shadowing of Ignicoccus cell with several Nanoarchaeum cells attached (left side 
of photograph). (H) Confocal laser-scanning micrograph using Nanoarchaeum (red) and Ignicoccus-specific probes (green). 
Panels F to H reproduced from Nature (152) with permission of the publisher. 


Pyrodictiaceae. The Pyrodictiaceae comprise 
three genera with a total of five to six species that 
have all been isolated from marine environments 
(155). They grow optimally above 100°C and include 
the species that grow at the highest temperatures 
(Table 5). Pyrodictium occultum, P. brockii, 
and P. abyssi were the first microorganisms found to 
grow optimally above 100°C (Tmax, 110°C) (155, 
389). Pyrodictium cells are disc shaped, flat (0.1 to 
0.2 µm), and pleomorphic, similar to Haloferax and 
Haloquadratum (Fig. 4). Pyrodictium grows in hy- 
potonic media, in contrast to the extreme halophiles 
that have a cytoplasm that is isosmotic with the 
medium. It is not clear how Pyrodictium cells main- 
tain their shape with a positive turgor pressure inside 
the cell, because they only appear to have a proteina- 
ceous S layer to physically support the cell. Cells are 
covered with a hexagonal S layer anchored to the 
membrane with stalks (see Chapter 14). The resulting 
pseudoperiplasmic space has a constant width of ~35 
nm. The cells are embedded in a network of hollow 
tubules (cannulae), which have an outer diameter of 
25 nm. The cannulae comprise three similar and 
probably homologous glycoproteins (278). Daughter 
cells remain connected to the mother cell by the can- 
nulae, which are anchored in the S layer. Growing 
cultures form macroscopically visible flocks and 
biofilms. 

Pyrodictium species are anaerobes growing by 
S°/H2 autotrophy. S°-reducing and H2-oxidizing 
chemolithoautotrophs require at least two mem- 
brane-bound enzymes for energy conversion: a hy- 
drogenase, and sulfur reductase (SR) or polysulfide 
reductase (PSR; see Fig. 26) that are coupled in a 
short electron transport chain. Both enzymes have 
been characterized from P. brockii (308, 309), and in 
more detail from P. abyssi (71). A 520-kDa mem- 
brane-bound multienzyme complex purified from P. 
abyssi is composed of nine subunits and has hydro- 
genase and SR activities (71). It contains a variety of 
transition metals, heme b and heme c, but no Mo or 
W that are typically present in SR or PSR. The single 
complex contains all the constituents necessary for 
the entire electron transport from H2 to S°. 

S° undergoes a transition from crystalline rhombic 
α-S8, or monoclinic β-S8, to the biologically inaccessible 
polymeric (liquid) sulfur at 113 and 119°C, 
respectively. As a result, microorganisms that require 
available S° for growth are limited to temperatures 
below these values. Consistent with this, the two mi- 
croorganisms with the highest Tmax use other electron 
acceptors. Pyrolobus fumarii is a formally described 
species that has a Tmax of 113°C and a Topt of 106°C; 
it is unable to grow at temperatures below 90°C (155). 
Pyrodictium and Pyrolobus species have been shown to 
survive autoclaving at 121°C for one hour. Pyrolobus 
grows chemolithoautotrophically with H2 as electron 
donor and thiosulfate, nitrate, or a low partial pressure 
of oxygen as terminal acceptors, with H2S, ammonia, 
and water as products, respectively. One published 
report exists for strain 121 (a formal description 
is not available) that describes its growth on formate 
with Fe³⁺ as electron acceptor at a Tmax of 121°C 
and a Topt of 106°C (179). The 16S rDNA sequence 
is most similar to those of other species of Pyrodictiaceae (95 
to 96%), suggesting that it also belongs to the same 
family. 

The only representative of the third genus of the 
Pyrodictiaceae, Hyperthermus butylicus, differs from 
Pyrodictium and Pyrolobus by being entirely depen- 
dent on complex organic substrates for growth 
and being unable to grow chemolithoautotrophically 
(447). It has a Topt of 103°C and performs mixed bu- 
tyric acid fermentation from amino acids, producing 
large amounts of C4 and C; organic acids by an un- 
known pathway. The annotated genome sequence of 
H. butylicus is being finalized. Presently, it is the mi- 
croorganism with the highest Topt for which genome 
sequence data are available. Its capacity to grow at 
such high temperatures should reveal thermostable 
enzymes of commercial interest (see Chapter 22). In 
addition, H. butylicus is a heterotroph with the abil- 
ity to degrade various organic polymers and therefore 
synthesizes numerous dehydrogenases and esterases, 
which may also be of commercial value (58). 


Thermoproteales 


Members of the Thermoproteales have been iso- 
lated from volcanically heated terrestrial, acidic, and 
neutral springs, soil biotopes, and water and mud 
holes, in addition to Pyrobaculum aerophilum, which 
is the sole marine isolate from a vent in shallow water 
off the coast of Ischia Island, Italy (Table 6) (153). In 
contrast to the Desulfurococcales, which are coccoid 
or pleomorphic, the Thermoproteales form stiff rods 
(Fig. 21). Many strains produce branched and/or 
“golf club”-like structures (stiff rods with a coccoid 
extension or daughter cell at their ends). Most mem- 
bers of the Thermoproteales are anaerobic hyperther- 
mophiles, with the exception of Thermocladium 
modestius, which is a thermophile (Topt 75°C). The 
two different families of Thermoproteales, the Thermoproteaceae 
and Thermofilaceae, can be distinguished 
by morphology. The Thermoproteaceae form 
stiff rods, ≥0.4 µm in diameter. In contrast, the two 
Thermofilaceae species form thin rods (0.15 to 0.35 
µm in diameter). 

Thermoproteales grow heterotrophically and/or 
chemolithoautotrophically with a Topt of 75 to 100°C 
(153). Thermoproteus neutrophilus and Pyrobaculum 
species have a neutral pHopt (6 to 7), whereas 
Caldivirga, Vulcanisaeta, and Thermocladium are 
acidophiles (pHopt 3.7 to 4.5), and several others are 
moderate acidophiles (pHopt 5 to 6). Growth is usu- 
ally best in media of low ionic strength. Most Ther- 
moproteales grow anaerobically, whereas Pyrobacu- 
lum oguniense grows aerobically, and several other 
species, including P. aerophilum, T. modestius, and 
Caldivirga maquilingensis, are able to grow at low 
oxygen tension. Caldivirga is the only known sulfate-reducing 
member of the Crenarchaeota. 

Figure 21. Electron micrographs of Thermoproteus and Thermofilum. (A, B) Thermoproteus tenax cells with branched form. 
(C, D) Thermofilum pendens with golf club structure. Reproduced from Bergey’s Manual of Systematic Bacteriology (451, 452) 
with permission of the publisher. 

Thermoproteus tenax and P. aerophilum have 
become crenarchaeal model organisms, and their 
genomes have been sequenced. T. tenax grows by 
chemolithoautotrophic oxidation of H2, or chemoorganoheterotrophically 
with S° as electron acceptor. 
It is sufficiently oxygen tolerant to permit cultivation 
without strict anaerobic handling. The enzymology 
and regulation of sugar breakdown, and gluconeoge- 
nesis have been studied extensively in T. tenax (see 
Chapter 12). The related species, Thermoproteus neu- 
trophilus, has been used to study enzymes and mech- 
anisms of autotrophic CO2 fixation via the reductive 
citric acid cycle. The same pathway seems to function 
in Pyrobaculum islandicum. In contrast, Pyrodictium 
species appear to utilize ribulose bisphosphate car- 
boxylase (160). Similar to T. tenax, P. aerophilum 
grows facultatively chemolithoautotrophically with 
H2 and organic nutrients as electron donors. However, 
it differs from T. tenax by using nitrate, oxygen, 
thiosulfate, arsenate, or selenate as electron 
acceptors. Sulfur inhibits growth of P. aerophilum but not 
the growth of other species of this genus (153). 

The only known archaeal nitrate reducers are 
P. aerophilum, P. fumarii from the Crenarchaeota, 
and F. placidus and some species of Haloferax from 
the Euryarchaeota (4, 129, 257). Several metalloen- 
zymes have been isolated from Pyrobaculum spp. The 
membrane-bound dissimilatory nitrate reductase con- 
tains molybdenum, several FeS clusters, and heme b 
and consists of three subunits similar to nitrate re- 
ductases from bacterial mesophiles (4). A nitric ox- 
ide reductase displays menaquinol:NO oxidoreduc- 
tase activity and contains heme and nonheme iron in 
a 2:1 ratio (70). Additional Pyrobaculum enzymes 
that have been purified include a tungsten aldehyde 
oxidoreductase that is similar to the enzymes from 
Pyrococcus and a siroheme sulfite reductase similar to 
an Archaeoglobus enzyme (61, 130). 


Cenarchaeum, Nitrosopumilus, and 
Uncultured Mesophilic Crenarchaeota 


Almost all cultivated members of the Crenar- 
chaeota are hyperthermophiles. However, members 
of the Crenarchaeota have been detected in molecular 
ecology studies from numerous cold or temperate en- 
vironments (reviewed in references 51, 353). Despite 
their ecological relevance, Archaea from cold envi- 
ronments have been significantly understudied. The 
available cultivatable isolates are restricted to six for- 
mally characterized methanogens, one haloarchaeon, 
and one “cold crenarchaeon” (51). In addition, 
Cenarchaeum symbiosum (Crenarchaeota) and 
“euryarchaeon SM1” (Fig. 7) have been studied but 
cannot be cultivated as monocultures (Table 7) (see 
“Cell walls and extracellular structures,” above) (51). 

Symbiotic populations of C. symbiosum are pre- 
sent in the marine sponge Axinella mexicana, and re- 
lated phylotypes have been recovered from other 
sponges in different oceanic regions, illustrating that 
this specific metazoan-archaeal association is wide- 
spread (51). It is difficult to separate the archaeal cells 
from the sponge tissue and bacterial endosymbionts. 
However, sufficient archaeal biomass has been ob- 
tained to enable the construction of a genomic fosmid 
library and to fulfill the requirements for a genome- 
sequencing project of C. symbiosum 
(http://web.mit.edu/esi/html/researchsub/organize.html). It was an 
interesting discovery that Cenarchaeum is not mono- 
clonal in Axinella and is present as a population of 
strains with nucleotide sequence differences =10% 
and with additional rearrangements within coding re- 
gions. C. symbiosum was also the first cold-adapted 
member of the Crenarchaeota to have a functional 
gene (DNA polymerase) expressed in E. coli (356). 

It took considerable effort and patience before 
the first cultivated, free-living member of the Crenar- 
chaeota was obtained from a nonthermophilic envi- 
ronment (212). From the sequence analysis of large 
genomic DNA fragments cloned directly from the en- 
vironment, it was predicted that some members of the 
Crenarchaeota may be able to utilize inorganic nitro- 
gen compounds for energy metabolism (405). Despite 
this prediction, it was a revelation when “Nitro- 
sopumilus maritimus” was isolated and found to 
grow chemolithoautotrophically by nitrification 
(212). It appears to be phylogenetically positioned as 
a novel order of the Crenarchaeota that is related to a 
number of uncultivated organisms from environmen- 
tal samples. Little else has been characterized about 
its cellular properties. 


(Thermo-) Acidophilic Archaea: 
Thermoplasmatales and Sulfolobales 


Solfataric fields, named after the Solfatara 
caldera near Naples, Italy (Fig. 17), are typical habi- 
tats of archaeal thermoacidophiles (Tables 8 and 9). 
Many oxidize sulfur and ISCs for energy conservation 
and are largely responsible for the low pH values ob- 
served in these environments (reviewed in references 
205, 207). Solfataras are typically small terrestrial, 
steam-heated pools of surface water or mud, with cell 
densities sometimes in excess of 1 × 10⁸ ml⁻¹ and a 
pH of 1 to 4. Two phylogenetically unrelated groups 
of archaea dominate solfataras. The majority of 
microorganisms are lobed, irregular cocci of 1 to 2 µm 
in diameter that grow aerobically and/or anaerobically 
at ambient boiling temperature in their natural 
environments and belong to six different genera of the 
obligately acidophilic Sulfolobales (Table 9) (154). 

In contrast, Thermoplasma and Picrophilus 
(members of the Euryarchaeota) are found in lower- 
temperature pools, and very acidic soils with a low 
water activity, in and around solfataric fields (Table 8) 
(156). Picrophilus spp. are the most acidophilic 
organisms, with a pHopt of 0.7 at 59°C, and growth 
is possible at pH 0 (354, 355). A comparison of the 
genome sequences of Sulfolobales (three species) and 
Thermoplasmatales (four species) revealed that a sig- 
nificant fraction of the protein-encoding genes from 
Sulfolobus species had their best match with ho- 
mologs in the genomes of Thermoplasmatales species 
(with the exception of homologs from the other 
strains of Sulfolobus) (336). This indicates that a high 
level of gene exchange is occurring under conditions 
that may be expected to rapidly degrade extracellu- 
lar DNA and indicates that genome composition can 
be strongly influenced by the ecosystem in a way that 
is largely independent of an organism’s phylogeny 
(see Chapter 5). 

Ferroplasma acidiphilum and F. acidarmanus 
belong to a third genus of the Thermoplasmatales and 
were isolated from acid mine drainage where they 
were found to oxidize a variety of metals for energy 
conservation. They were the first mesophilic archaea 
to be isolated that do not belong to the methanogens 
or haloarchaea (114, 115). 

The prototypes of thermoacidophilic archaea are 
the Thermoplasmatales and the Sulfolobales. This 
phenotypic class is slowly expanding and includes 
isolates from other archaeal lineages, such as anaero- 
bic Acidilobus (Desulfurococcales), and Vulcanisaeta, 
Thermocladium, and Caldivirga (Thermoproteales) 
(312). However, there are few reports that describe 
biochemical or molecular properties of these latter 
thermoacidophiles. 

Sulfolobus solfataricus and S. acidocaldarius 
have proven to be model organisms for studies on 
biochemical pathways, enzymology, cell biology, gene 
regulation, DNA replication, and the development of 
genetic tools in Archaea. S. acidocaldarius and A. am- 
bivalens have developed into model organisms for 
studying heterotrophic and chemolithoautotrophic 
energy conservation, respectively. Studies on Thermo- 
plasma have advanced the understanding of prote- 
olytic and peptidolytic enzymes (e.g., proteasome). 
Picrophilus provides a model for acidophily, although 
it has recently been reported that intracellular 
proteins from Ferroplasma are better adapted to low 
intracellular pH than those from Picrophilus (90) (see 
“Thermoplasmatales,” below). 


Thermoplasmatales 


Thermoplasma acidophilum was isolated from 
smoldering, self-igniting coal refuse piles and gained 
considerable attention due to its unusual habitat and 
growth conditions (Table 8) and the fact that it is de- 
void of a cell wall (156). Studies of its membrane 
composition, and subsequent studies of S. acidocal- 
darius, led to the discovery of phytanyl ether lipids 
in thermophilic archaea (227, 228). Thermoplasma 
species have been detected worldwide in moderately 
heated, but always very acidic, pools and soils of sol- 
fataric fields. They thrive aerobically between 40 and 
68°C at pH 0 to 4 on organic, preferentially 
proteinaceous nutrients and can grow anaerobically 
provided that S⁰ is present to function as a terminal 
electron acceptor (156). 

In contrast to Thermoplasma, Picrophilus pos- 
sesses a cell wall (Fig. 22), consisting of a proteina- 
ceous S layer with tetragonal symmetry. Picrophilus 
grows aerobically at pH 0 to 2 on yeast extract (Topt 
60°C), similar to Thermoplasma, but it is unable to 
grow anaerobically. The cytoplasmic pH for Pi- 
crophilus is 4.5 (pH 5.6 for Ferroplasma), which is 
the lowest measured for a living cell (113, 413). Some 
low-molecular-mass compounds are very unstable un- 
der these conditions and rapidly degrade. For exam- 
ple, the in vitro half-life for NADPH is only 2.4 min 
at 60°C and 1.7 min at 65°C (10). Because of the pH 
of the environment and cytoplasm, Picrophilus en- 
zymes may be expected to be adapted to low pH. 
However, unlike the secreted proteins, the pHopt of 
heterologously produced enzymes is near neutrality 
for most of the cytoplasmic enzymes, although it 
should be noted that only a limited number of Pi- 
crophilus enzymes have been studied. 
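
The NADPH half-lives quoted above can be turned into decay constants and remaining fractions. The following is a minimal sketch that assumes simple first-order decay (an assumption; the text reports only the two half-lives), and the function names are illustrative.

```python
import math

# Half-lives quoted in the text for NADPH in vitro:
# temperature (deg C) -> half-life (min).
HALF_LIFE_MIN = {60: 2.4, 65: 1.7}


def decay_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of the starting material left after elapsed_min minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    for temp_c, t_half in HALF_LIFE_MIN.items():
        k = decay_constant(t_half)
        left = fraction_remaining(t_half, 10.0)
        print(f"{temp_c} C: k = {k:.3f} per min, {left:.1%} left after 10 min")
```

Under this assumption, only a few percent of the NADPH would survive a 10-min incubation at either temperature, which illustrates why such cofactors are considered unstable under these conditions.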

The genus Ferroplasma comprises slow-growing, 
iron-oxidizing archaea that are devoid of a cell wall 
(Fig. 22) and require large amounts of iron for energy 
conservation when growing chemolithoautotrophically 
(12 mg of protein g⁻¹ Fe²⁺) (114). 
F. acidiphilum was isolated from a metal-bioleaching 
pilot plant, and F. acidarmanus was isolated from 
Richmond Mine at Iron Mountain, Calif. (Table 8) 
(156). Ferroplasma species have a pHopt of 1.3 to 1.7 
and a Topt of 35 to 42°C and also grow anaerobically 
with organic nutrients, provided that Fe³⁺ is supplied 
as electron acceptor. Ferroplasma is highly tolerant to 
potentially toxic (transition) metals, consistent with 
the presence of these compounds in bioleaching 
plants and acid mine effluents; the concentration of 
Fe is 28 to 111 g liter⁻¹, and the concentrations of 
Zn, Cu, Cd, and As ions exceed 2.5 g liter⁻¹, 380 mg 
liter⁻¹, 250 mg liter⁻¹, and 53 mg liter⁻¹, respectively, 
at Iron Mountain (76, 77). 
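
To put these figures on a common scale, the sketch below converts the quoted mass concentrations to molar concentrations and uses the quoted growth yield to estimate how much iron must be oxidized for a given amount of protein. The atomic masses are standard rounded values, not taken from the text, and the helper names are illustrative.

```python
# Standard atomic masses in g/mol (rounded; not taken from the text).
ATOMIC_MASS = {"Fe": 55.845, "Zn": 65.38, "Cu": 63.546, "Cd": 112.414, "As": 74.922}

# Concentrations quoted in the text for Iron Mountain, in g per liter.
# Fe is the reported range; the other entries are lower bounds ("exceed").
FE_RANGE_G_PER_L = (28.0, 111.0)
LOWER_BOUND_G_PER_L = {"Zn": 2.5, "Cu": 0.380, "Cd": 0.250, "As": 0.053}

# Growth yield quoted in the text: mg of protein per g of iron oxidized.
YIELD_MG_PROTEIN_PER_G_FE = 12.0


def to_molar(element: str, grams_per_liter: float) -> float:
    """Convert a mass concentration (g/liter) to mol/liter."""
    return grams_per_liter / ATOMIC_MASS[element]


def iron_required_g(protein_mg: float) -> float:
    """Grams of iron oxidized to form protein_mg of protein at the quoted yield."""
    return protein_mg / YIELD_MG_PROTEIN_PER_G_FE


if __name__ == "__main__":
    low, high = (to_molar("Fe", c) for c in FE_RANGE_G_PER_L)
    print(f"Fe: {low:.2f} to {high:.2f} mol/liter")
    for element, conc in LOWER_BOUND_G_PER_L.items():
        print(f"{element}: more than {to_molar(element, conc) * 1000:.2f} mmol/liter")
    print(f"Iron for 1 g of protein: {iron_required_g(1000.0):.1f} g")
```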




Figure 22. Morphology and ultrastructure of Picrophilus torridus and Ferroplasma acidiphilum. (I) Picrophilus torridus: (a) Pt- 
shadowed P. torridus showing the outline of the cell shape; (b) ultrathin section of cytoplasmic membrane (cm) and S layer 
(sl); (c) negatively stained individual cell; (d) ultrathin section showing pleomorphic morphology. (II) Ferroplasma acidiphilum: 
(a) ultrathin section of F. acidiphilum showing irregular cell shape; chr, translucent chromatoid region; (b) enlargement of the 
cell envelope with cytoplasmic membrane (cm) and no cell wall; (c) Pt-shadowed cells showing cytoplasmic lobes (cl) pro- 
truding from the surface. Reproduced from Environmental Microbiology (115) with permission from the publisher. 


Genome sequence data of Ferroplasma spp. were 
highly represented in the first environmental genome 
project (metagenome) that was conducted on DNA 
sourced from a biofilm taken from acid mine drainage 
effluents at the Richmond Mine (408). From the 
76 Mb of sequence data, the genomes of the domi- 
nant bacterium, a Leptospirillum group II species, 
and the dominant archaeon, Ferroplasma sp. type II, 
were largely reconstructed. Large scaffolds were also 
obtained for another Ferroplasma strain, which is 
closely related to F. acidarmanus Fer 1 (type I), an un- 
cultivated archaeon (G-plasma) (408) that belongs to 
a novel genus within the Thermoplasmatales, and 
other bacteria, including another Leptospirillum 
(group III) and a Sulfobacillus species. The recon- 
struction of cellular metabolisms found a consider- 
able degree of specialization. For example, Leptospir- 
illum group III appeared to be the only organism 
group in the community that possessed nitrogen fix- 
ation (nif) genes. 

Recently, a novel membrane-bound α-glucosidase 
from Ferroplasma acidiphilum was found to 
have an acidic pHopt and to require iron for activity 
(90). Three other cytoplasmic α-glucosidases and a 
carboxylesterase had similar properties (113). The 
acidic pHopt of enzymes compared with a more neutral 
pH (5.6) of the cytoplasm indicates that low-pH 
cellular compartments may exist in the organism. 
The iron dependence of enzymes may indicate that 
iron is not fully excluded from the cytoplasm and the 
enzymes have evolved a requirement for metal ions. 
Despite its relatively slow growth rate and the low 
fermentation yields, Ferroplasma represents a useful 
model for studying adaptation to acidophily. 

Thermoplasma acidophilum has served as an important 
model organism for studying proteolysis and 
the consecutive action of proteases and peptidases 
(258). The proteasome was discovered in T. aci- 
dophilum and found to be similar to the eucaryal 
counterpart (see Chapter 10). The proteasome plays a 
central role in the degradation of misfolded and in- 
active (ubiquitinylated in the case of Eucarya) cellular 
proteins. The architecture of the proteasome was elu- 
cidated by X-ray crystallography using the 20S pro- 
teasome from T. acidophilum and subsequently from 
yeast (127, 245). The cylinder-shaped core enzyme is 
composed of 28 protein subunits that are arranged 
in four stacked rings of seven subunits each, which 
form an elongated cylinder with three large cavities 
connected by narrow constrictions (Fig. 23) (126). 
The four rings of the Thermoplasma enzyme are 
made from two subunit types in an α7β7β7α7 stoichiometry, 
whereas the eucaryal enzymes consist of up to 
14 different subunits in the same three-dimensional 
arrangement. The active site of the 700-kDa particle 
lies in the central chamber of the interior. The pro- 
teasome associates with AAA “unfoldases” (protea- 
some-activating nucleotidases) and catalyzes an en- 
ergy-dependent degradation of proteins into fragments 
of 3 to 30 amino acids in length. The actual substrate 
for the archaeal proteasomes is not known, but there 
are indications that phosphorylated proteins might be 
the targets (no ubiquitin homolog has been unequiv- 
ocally identified in Archaea) (30, 258). 

The peptide products of proteasomes are further 
degraded by another large complex. This complex 
consists of several interacting enzymes whose frame- 
work is provided by the “tricorn peptidase” (named 
for its tricornlike shape) and was also purified and 
crystallized from T. acidophilum. It consists of multi- 
ple copies of a single, 120-kDa polypeptide chain 
(Fig. 23). Six subunits form hexamers that further 
assemble into an icosahedral capsid with a molecular 
mass of 14.6 MDa. The crystal structure of the 720- 
kDa hexamer shows that each monomer consists of 
five separate domains, which may function by 
coordinating the steps of substrate channeling to the 
active site (126). The peptidase digests oligomeric 
peptides to tri- and dipeptides. These are then 
sequentially degraded to free amino acids by peptidases 
(tricorn interacting factors F1, F2, and F3) that are 
small enough to fill the open spaces in the tricorn 
assembly; partial structures of these have been 
determined (219). Thus, protein unfoldases, proteasomes, 
the tricorn protease, and its interacting factors 
form a supramolecular protein degradation machinery 
that processes proteins from their folded state 
through to single amino acids that can be reused for 
cellular metabolism (37, 126). 

Figure 23. (See the separate color insert for the color version of this 
illustration.) Three-dimensional structures of the T. acidophilum 
proteasome and tricorn protease. (A) Side view of the 26S 
proteasome/activator particle with the two sets of seven terminal PA26 
subunits and the α7β7β7α7 rings (PDB code 1YA7) (93). (B) 
Top view of the 20S proteasome core particle showing the sevenfold 
symmetry (PDB code 1PMA) (245). (C) Top view of the 
homohexameric tricorn protease complexed with a tridecameric peptide 
derivative (PDB code 1N6E) (195). 
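
The subunit counts and masses given for the proteasome and the tricorn complex can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted in the text; the mean subunit mass and the number of hexamers per capsid are derived estimates, not values stated by the source, and the function names are illustrative.

```python
def proteasome_summary(rings: int = 4, per_ring: int = 7,
                       particle_kda: float = 700.0) -> tuple[int, float]:
    """Return (total subunits, mean subunit mass in kDa) for the 20S core."""
    total = rings * per_ring
    return total, particle_kda / total


def tricorn_summary(monomer_kda: float = 120.0, per_hexamer: int = 6,
                    capsid_mda: float = 14.6) -> tuple[float, float]:
    """Return (hexamer mass in kDa, estimated hexamers per capsid)."""
    hexamer_kda = monomer_kda * per_hexamer
    hexamers = capsid_mda * 1000.0 / hexamer_kda
    return hexamer_kda, hexamers


if __name__ == "__main__":
    total, mean_kda = proteasome_summary()
    print(f"Proteasome: {total} subunits, about {mean_kda:.0f} kDa each on average")
    hexamer_kda, hexamers = tricorn_summary()
    print(f"Tricorn: {hexamer_kda:.0f}-kDa hexamer, about {hexamers:.1f} hexamers per capsid")
```

The estimate of roughly 20 hexamers per capsid matches the number of faces of an icosahedron, which is consistent with the icosahedral shape described above.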

The tricorn protease is not widely distributed in 
Archaea, and different peptidases are present for pep- 
tide breakdown. A tetrahedral-shaped, dodecameric 
480-kDa aminopeptidase (TET) has been crystallized 
from P. horikoshii, and TET has also been purified 
from H. marismortui (37, 338). This self-compart- 
mentalizing complex degrades some proteins and 
most peptides down to amino acids. TET contains a 
binuclear zinc active center, and proteolysis occurs 
independently of NTP hydrolysis. The pores for 
substrate access have a maximal diameter of 10 Å, which 
allows only small peptides to enter the active site. 
This protease has homologs in many archaea and 
bacteria (see Chapter 10). 


Sulfolobus, Acidianus, and relatives 


Sulfolobus acidocaldarius was the first 
hyperthermophile isolated (44). Cells are irregular, lobed, 
and sometimes motile cocci that occur singly with cell 
diameters of 1 to 2 µm (Fig. 24). It grows aerobically 
either by the chemolithoautotrophic oxidation of S⁰, 
thiosulfate, sulfidic ores, or H₂, or heterotrophically 
with various organic substrates. The cell morphology 
of the five to six genera of Sulfolobales is relatively 
uniform (Fig. 24; Table 9), and most cells in samples 
of acidic solfataras (as well as isolates) are micro- 
scopically indistinguishable (154). 

The Sulfolobales genera tend to be distinguished 
by physiological properties rather than by phylogenetic 
differences in their 16S rDNA sequences (reviewed 
in reference 154). All Sulfolobales have a Topt 
of 65 to 95°C and pHopt of 2 to 4. The species of the 
genus Sulfolobus are characterized by aerobic growth 
and a high metabolic versatility. Most Sulfolobus 
strains, including the model organisms S. acidocal- 
darius, S. solfataricus, and S. tokodaii, can be routinely 
grown with peptides and/or sugars as carbon and 
energy sources at pH 3 and 75 to 85°C (Table 9). S. 
metallicus is a facultatively chemolithoautotrophic, 
sulfur- and metal sulfide-oxidizing aerobe that is 
frequently found in metal-rich, self-heating mine heaps 
and bioleaching operations. These growth properties 
are atypical for the genus, but not more generally for 
the Sulfolobales order. 

Figure 24. Electron micrographs of Acidianus ambivalens. (Left) Phase-contrast micrograph; bar, 10 µm. Photograph courtesy 
of K. Lauber, Darmstadt, Germany. (Right) Transmission electron micrograph of thin section; bar, 1 µm. Photograph courtesy 
of W. Zillig, Martinsried, Germany. 

Stygiolobus species are anaerobic chemolithoau- 
totrophic sulfur reducers. The genus Acidianus (Fig. 24) 
comprises several chemolithoautotrophic and faculta- 
tively anaerobic and physiologically versatile species. 
The genus was initially characterized as growing 
aerobically by S⁰ oxidation to sulfuric acid, or 
anaerobically by H₂ oxidation with S⁰ as the electron 
acceptor. In addition to these metabolic properties, Acidianus 
species grow by H₂ oxidation with O₂ (Knallgas 
reaction), anaerobic respiration with various electron 
acceptors (e.g., molybdate, Fe³⁺, or arsenate), or 
oxidation of tetrathionate and metal sulfides (e.g., 
pyrite, chalcopyrite, and sphalerite), which causes bi- 
oleaching (339). Some Acidianus strains are faculta- 
tive heterotrophs. The bioleaching species of Acidi- 
anus, Sulfolobus metallicus, and Metallosphaera 
contribute to base and precious metal extraction and 
to the formation of acidic drainage downstream of 
mines and slag heaps (154). Metallosphaera spp. are 
strict aerobes and metal leachers, and they grow at 
more moderate temperatures (Topt 65 to 72°C), similar 
to Acidianus brierleyi and Sulfolobus thuringiensis. 
The genus Sulfurisphaera comprises facultatively 
anaerobic strains that are also facultative heterotrophs. 
A sixth genus, Sulfurococcus, has been described, 
although the cultures were never deposited in a culture 
collection and are probably no longer available. 

Many Sulfolobales isolates were termed “Sul- 
folobus,” although subsequent nucleic acid hy- 
bridization and 16S rDNA sequencing suggested that 
they were distantly related to each other and should 
be phylogenetically reclassified (Fig. 25) (154). For 
example, Sulfurisphaera ohwakuensis, Sulfolobus 
tokodaii, and S. yangmingiensis have high 16S rDNA 
nucleotide identity (=99%) and should probably be 
considered a single genus or species (154), while the 
similarity between S. tokodaii and S. solfataricus is 
only ~90%, suggesting that they might belong to 
separate genera (Fig. 25). Several type strains (e.g., 
S. acidocaldarius and S. solfataricus) were unable to 
oxidize S⁰ a number of years after they were isolated 
(284), a property that had been demonstrated (367) 
and used to originally define Sulfolobus (44, 455). 
Most of the early isolates were purified from natural 
samples by successive rounds of serial dilution and 
not by plating (44, 455) and were likely to have in- 
cluded cocultures of heterotrophs and autotrophs. 
This is no longer an issue for pure cultures that have 
been purified by using improved plating techniques 
for hyperthermophiles (e.g., 450). 

Compared with many archaea, the heterotrophic 
Sulfolobales strains are easy to handle and cultivate. 
This has led to the generation of systems for genetic 
manipulation (e.g., 5) and has made them model 
organisms for studying many cellular processes. Numerous 
3D structures for proteins from Sulfolobales are in 
the Protein Data Bank. Chemolithoautotrophic 
Sulfolobales are more difficult to grow on plates because 
they grow slowly and S⁰ needs to be precipitated on 
solid media, or colloidal sulfur used (450). Complete 
genome sequences are available for S. solfataricus, 
S. tokodaii, and S. acidocaldarius. 

Figure 25. Phylogenetic dendrogram of the Sulfolobales based on 16S rDNA sequences. Scale bar, 10 estimated exchanges 
within 100 nucleotides. Reproduced with minor modifications from The Prokaryotes (154) with permission of the publisher. 


Sulfur metabolism in A. ambivalens 


Oxidation and reduction of sulfur and inorganic 
sulfur compounds is a characteristic often associated 
with the Archaea. The reduction of S⁰ is a common 
physiological property (Fig. 26) (7). However, sulfur 
oxidation is primarily a feature of acidophiles as the 
oxidation product causes a marked decrease in the pH 
of the environment. Acidianus species can do both; 
depending on the culture conditions, they reduce S⁰ 
to hydrogen sulfide or oxidize S⁰ to sulfuric acid. 


Sulfur reduction in A. ambivalens. Sulfur reduc- 
tion in anaerobically grown A. ambivalens requires 
sulfur reductase (SR) and hydrogenase activities com- 
parable to those of Pyrodictium spp. (231) (see “Pyro- 
dictiaceae” above). The two enzymes have been pu- 
rified. The hydrogenase gene is present in a multigene 
cluster that includes genes for a NiFe and an FeS sub- 
unit. The subunits are not very similar to other hy- 
drogenases (highest pairwise similarity, 40%) (231). 
The SR operon consists of a five-gene cluster, 
sreABCDE. The deduced amino acid sequences show 
similarity to molybdoenzymes of the dimethyl 
sulfoxide/nitrate reductase family (231), and molybdenum 
was found in solubilized membrane fractions, suggesting 
that the SR is a molybdoprotein. The molecular 
composition of the A. ambivalens SR is similar to the 
Wolinella succinogenes polysulfide reductase. Both en- 
zymes consist of homologous catalytic and electron 
transfer subunits presumably oriented toward the 
(pseudo-) periplasm, and nonhomologous membrane 
anchors (231). It is assumed that an electrochemical 
gradient is generated via a Q-cycle mechanism, similar 
to bc1 complexes of the respiratory chain (“Q cycle” 
describes a mechanism where reduction of the quinone 
leads to proton uptake in the cytoplasm, and reoxi- 
dation of the quinone leads to release of protons into 
the periplasm) (406) (see “Energy metabolism: aerobic 
electron transport chains” below). 

A gene cluster that is similar to A. ambivalens 
sreABCDE is present in the genome of S. solfatari- 
cus (but not in S. acidocaldarius or S. tokodaii), sug- 
gesting that it should have the capacity to grow by 
heterotrophic anaerobic S⁰ respiration, but attempts 
to demonstrate this have been unsuccessful so far 
(A. Kletzin, unpublished results). None of the three 
Sulfolobus genomes contains hydrogenase genes, 
indicating that they are not expected to be able to grow 
lithotrophically with H₂, as A. ambivalens does. 


Sulfur oxidation in A. ambivalens. The ability to 
aerobically or anaerobically oxidize S⁰ and ISCs is 
widespread in the microbial world (Fig. 26). Bacterial 
sulfur oxidizers are physiologically and phylogenetically 
diverse. In contrast, only a few archaeal sulfur 
oxidizers are known, and these are members of the 
Sulfolobales. The biochemistry and electron transport 
chains of S⁰ oxidation were determined primarily using 
Acidianus species (205, 207) and found to differ 
significantly from the bacterial models. 

Figure 26. The bioinorganic sulfur cycle. Sulfur cycle depicted with 
enzyme reactions involving sulfur compounds. Enzymes: 1, polysulfide 
reductase; 2, sulfur reductase; 3, sulfide:quinone oxidoreductase 
or sulfide:cytochrome c oxidoreductase; 4, sulfur oxygenase; 
5, sulfite:acceptor oxidoreductase (the acceptor is cytochrome 
c in most bacteria) or sulfite oxidase; 6, ATP sulfurylase or 
adenylylsulfate:phosphate adenylyltransferase (APAT); 7, ATP 
sulfurylase; 8, adenylylsulfate (APS) reductase; 9, sulfite reductase; 10, 
tetrathionate reductase; 11, thiosulfate:acceptor oxidoreductase; 
12, Sox complex; 13, sulfur oxygenase reductase; 14, thiosulfate 
reductase; 15, tetrathionate hydrolase (there is also a trithionate 
hydrolase in addition); 16, O-acetylserine or O-phosphoserine 
sulfhydrolases; 17, cysteine desulfurase; 18, APS kinase; 19, 
3’-phosphoadenylylsulfate (PAPS) reductase. Gray circles denote 
disproportionation reactions. Reactions of cysteine breakdown are 
omitted. Compiled from several sources (205, 214, 401) with 
permission of the publishers. 

The oxidation of S⁰ to sulfuric acid proceeds in 
at least two steps and involves intermediates such as 
sulfite, thiosulfate, and tetrathionate (Fig. 26) (e.g., 
reviewed in references 100, 101, 186, 401). (Thermo-) 
acidophilic archaea oxidize S⁰ with a soluble sulfur 
oxygenase reductase (86, 139, 204). Thiosulfate and 
sulfite are oxidized by membrane-bound oxidoreductases 
(271, 456). In contrast, neutrophilic bacteria 
oxidize S⁰ and ISCs with a periplasmic Sox multienzyme 
complex (100, 101). Sox genes are present in genomes 
of numerous mesophilic and thermophilic bacteria, 
but not in archaea (100). 

The sulfur oxygenase reductase (SOR), which 
performs the initial step in the S⁰ oxidation pathway 
of aerobic archaea, is unique. It produces sulfite, 
thiosulfate, and hydrogen sulfide in an oxygen-dependent 
sulfur disproportionation reaction when incubated at 
elevated temperatures with S⁰. Oxygen, but no other 
cofactors, is required for activity (86, 139, 204): 


4 S⁰ + O₂ + 4 H₂O → 2 HSO₃⁻ + 2 HS⁻ + 4 H⁺     (4) 

Thiosulfate is probably a nonenzymic product of 
sulfite condensation with excess S⁰. The SOR was 
purified from A. ambivalens, and the sor gene was 
expressed in E. coli to produce an active enzyme (409). 
Other SORs have been purified from A. brierleyi and 
A. tengchongensis (86, 139). Other sor genes are 
present in the genomes of S. tokodaii, F. acidarmanus, 
Picrophilus torridus, and the hyperthermophilic 
bacterium, Aquifex aeolicus (409), but absent from 
S. solfataricus and S. acidocaldarius. 
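
As a bookkeeping check on the overall SOR reaction, the sketch below tallies atoms and charge on both sides. It assumes the stoichiometry 4 S⁰ + O₂ + 4 H₂O → 2 HSO₃⁻ + 2 HS⁻ + 4 H⁺; the proton coefficient of 4 is the value required for mass and charge balance. The species table and function names are illustrative.

```python
from collections import Counter

# Each species maps to (element counts, net charge).
SPECIES = {
    "S0": ({"S": 1}, 0),
    "O2": ({"O": 2}, 0),
    "H2O": ({"H": 2, "O": 1}, 0),
    "HSO3-": ({"H": 1, "S": 1, "O": 3}, -1),
    "HS-": ({"H": 1, "S": 1}, -1),
    "H+": ({"H": 1}, +1),
}


def totals(side: dict[str, int]) -> tuple[Counter, int]:
    """Sum atoms and charge for one side of a reaction (species -> coefficient)."""
    atoms: Counter = Counter()
    charge = 0
    for name, coeff in side.items():
        elements, species_charge = SPECIES[name]
        for element, count in elements.items():
            atoms[element] += coeff * count
        charge += coeff * species_charge
    return atoms, charge


def is_balanced(reactants: dict[str, int], products: dict[str, int]) -> bool:
    """True if both sides carry the same atoms and the same net charge."""
    return totals(reactants) == totals(products)


REACTANTS = {"S0": 4, "O2": 1, "H2O": 4}
PRODUCTS = {"HSO3-": 2, "HS-": 2, "H+": 4}

if __name__ == "__main__":
    print("balanced:", is_balanced(REACTANTS, PRODUCTS))
```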

Three conserved cysteine residues are present in 
SOR enzymes. One of the Cys residues (Cys31 in 
A. ambivalens) was shown by site-directed mutagen- 
esis to be essential, while mutagenesis of the other 
two Cys residues reduced specific activity (54, 412). 
EPR spectroscopy and redox titration showed that 
the SOR contains a mononuclear nonheme iron center 
in the high-spin Fe³⁺ state and with a low midpoint 
potential (E⁰′, −268 mV). The reduction potential 
was more than 300 mV lower than typically 
found for this type of iron center, and low enough to 
explain the S⁰-reducing activity of the enzyme (E⁰′ 
[H₂S/S⁰] = −270 mV) (409). 
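
The significance of the low midpoint potential can be illustrated with the standard relation ΔG°′ = −nFΔE°′. The sketch below is a rough illustration only: it treats the reduced iron center as the electron donor and the S⁰/H₂S couple as the acceptor, takes n = 2 for S⁰ reduction, and uses +32 mV as a hypothetical "typical" iron-center potential (300 mV above −268 mV). None of these modeling choices comes from the source.

```python
FARADAY_C_PER_MOL = 96485.332  # Faraday constant


def delta_g_kj_per_mol(e_donor_mv: float, e_acceptor_mv: float,
                       n_electrons: int = 2) -> float:
    """Standard free energy change (kJ/mol) for electron transfer donor -> acceptor."""
    delta_e_v = (e_acceptor_mv - e_donor_mv) / 1000.0
    return -n_electrons * FARADAY_C_PER_MOL * delta_e_v / 1000.0


E_SOR_IRON_MV = -268.0     # midpoint potential quoted in the text
E_SULFUR_COUPLE_MV = -270.0  # H2S/S0 couple quoted in the text
E_TYPICAL_IRON_MV = 32.0   # hypothetical value, 300 mV above the SOR center

if __name__ == "__main__":
    sor = delta_g_kj_per_mol(E_SOR_IRON_MV, E_SULFUR_COUPLE_MV)
    typical = delta_g_kj_per_mol(E_TYPICAL_IRON_MV, E_SULFUR_COUPLE_MV)
    print(f"SOR iron center -> S0: {sor:+.2f} kJ/mol")
    print(f"Hypothetical typical center -> S0: {typical:+.1f} kJ/mol")
```

Under these assumptions, sulfur reduction by the SOR iron center is close to isoenergetic, whereas the same transfer from a center with a potential 300 mV higher would be strongly unfavorable.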

The SOR crystal structure determined to 1.7 Å 
resolution shows that the enzyme is a spherical 
homoicosatetramer (i.e., 24 subunits) with an external 
diameter of 150 Å (410, 411). It surrounds an empty 
cavity with a diameter of 71 to 107 Å (Fig. 27). A 
bidentate glutamate, two histidine ligands, and two 
water molecules coordinate the iron center in a 
“2-His 1-Carboxylate facial triad” structural motif (59, 
411). Mutation of any of the three iron ligands 
resulted in loss of activity and iron-binding capabilities 
(412). Residue Cys31 has a persulfide modification. 
The iron center and the three Cys residues are buried 
in a pocket in the interior of each monomer and are 
accessible only from the interior cavity. The core 
active site is composed of the iron site and the modified 
Cys31 (Fig. 27). Substrate entry proceeds 
through the hydrophobic channels along the 4-fold 
axes of the sphere. The SOR thus provides an enclosed 
reaction compartment separated from the cytoplasm. 
The presence of a persulfide shows that S⁰ 
is likely to be covalently bound to Cys31. The linear 
sulfur chain is aligned to the iron site and replaces the 
water ligands, poising the iron site for dioxygen binding 
and activation (411). 

Figure 27. (See the separate color insert for the color version of this illustration.) Three-dimensional structure of the A. ambivalens 
sulfur oxygenase reductase. (A) The SOR holoenzyme. Cartoon representation viewed along the crystallographic fourfold 
axis; cyan, α-helices; purple, β-sheets; red spheres, Fe ions. (B) Molecular accessible surface representation in the same 
orientation of inner surface of the sphere, color-coded according to the calculated electrostatic potentials: red, ≤−10 ± 1 kT/e; 
white, neutral; blue, ≥+10 ± 1 kT/e. (C) Cavity surface representation of the catalytic pocket, with conserved cysteines and 
iron highlighted; gray arrow, cavity entrance. (D) Effect of mutants on SOR activity; +, zero activity; |, reduced activity; 4, 
strongly reduced activity. The core active site composed of the Fe site and the persulfide-modified Cys31 is highlighted within 
ellipsoids. Reproduced with minor modifications from Science (411) with permission from the publisher. 


H₂S, sulfite, and thiosulfate oxidation. The oxidation 
of sulfite, sulfide, and thiosulfate in A. ambivalens 
requires membrane-bound, proton-pumping 
oxidoreductases since the SOR does not couple S⁰ 
oxidation to electron transport or substrate-level 
phosphorylation. Homologs of the sqr gene are present in 
many archaeal genomes; however, H₂S oxidoreductase 
activity (sulfide:quinone oxidoreductase) (402) 
has not been experimentally detected yet, despite 
considerable effort (A. Kletzin, unpublished results). 

A tetrathionate-forming, membrane-bound 
thiosulfate:quinone oxidoreductase (TQO) was purified 
from A. ambivalens. Oxygen electrode measurements 
showed that electrons were transported from thiosulfate 
to molecular oxygen via the terminal heme copper 
quinol:oxygen oxidoreductase. The 102-kDa holoenzyme 
consists of two subunits in an α2β2 stoichiometry 
(DoxDA). The fate of the tetrathionate is not 
clear. There is a possibility, however, that a 
thiosulfate/tetrathionate cycle exists (271). Tetrathionate is 
unstable in the presence of strong reductants and is 
reduced to thiosulfate in vitro at high temperatures. H₂S 
and sulfite might re-reduce tetrathionate formed by 
the TQO and thus feed electrons indirectly from the S⁰ 
disproportionation reaction catalyzed by the SOR into 
the quinone pool. 


Energy metabolism: aerobic electron 
transport chains 


Aerobic electron transport chains have been 
studied primarily in Crenarchaeota (S. acidocaldarius 
and A. ambivalens and, more recently, A. pernix) and, 
to a lesser extent, in Euryarchaeota (various 
Halobacteriales, T. acidophilum) (301). The prototypical 
aerobic electron transfer chains from mitochondria and 
bacteria (e.g., Paracoccus denitrificans) consist of 
complexes I to IV (Fig. 28). In contrast, electron 
transport chains in most other bacteria and archaea 
are composed of multiple heterologous and often 
redundant complexes, which are expressed depending 
on growth conditions. 

The components of aerobic electron transport 
chains of Archaea are described below according to 
their similarity to the canonical complexes I to IV and 
their associated functions. The components of elec- 
tron transport for oxygen reduction and chemo- 
lithoautotrophic aerobic sulfur oxidation are linked 
in A. ambivalens (see “Sulfur oxidation in A. ambivalens,”
above). ATP synthases are not described here;
they have been reviewed recently (128, 272).


I. The rotenone-sensitive NADH:quinone oxidoreductase
(NQO or complex I) accepts electrons
from NADH and reduces quinone, thereby pumping
protons (e.g., 406). Complex I consists of up to 42 subunits
in Eucarya and 14 in Bacteria (termed either
NuoA-N or Nqo1-14) (344). It is L-shaped in electron
micrographs, with large cytoplasmic and transmembrane
domains. Nine iron-sulfur clusters are present
in the cytoplasmic domain (Fig. 28) (148). NADH-oxidizing
versions of complex I with identical subunit
composition and cofactor specificity to bacterial or eucaryal
complexes have not been found in Archaea.
However, the proton-pumping F420H2 dehydrogenases
from M. barkeri and A. fulgidus are functionally very
similar (see Chapter 13). The archaeal enzymes are
composed of 11 subunits that are homologous to subunits
from the bacterial NQO (Nqo 4 to 14; Fig. 29).
Two additional subunits, FpoO and FpoF, are required
to oxidize the substrate F420 and to replace the
bacterial equivalent, NADH-accepting subunits (Nqo
1 to 3) that are missing in the archaeal enzymes. The
Ech hydrogenases from several methanogens also
pump protons. However, although their subunits are
homologous to those of complex I and they are considered
to be its precursors, they serve a different function
in methanogens (140). Other, aerobic archaea, including
P. aerophilum, A. pernix, and the Thermoplasmatales,
encode the same 11 subunits as Methanosarcina. However,
these complexes have not been studied and the electron
donors are not known.

Figure 28. (See the separate color insert for the color version of this illustration.) Canonical respiratory chain in bacteria and
mitochondria. Scheme based on 3D structures with the exception of the membrane domain of complex I, for which a structure
is not available. Domains that have not been identified in Archaea are shown in black. PP, periplasm; CM, cytoplasmic
membrane; CP, cytoplasm; Q, quinols/quinones. The figure was prepared from the coordinates of PDB entries 1FUG (complex
I, Thermus thermophilus), 1NEK (complex II, E. coli), 1KYO (complex III, Saccharomyces cerevisiae), 1EHK (complex
IV, Thermus thermophilus), and 2CCY (cytochrome c, Rhodospirillum molischianum).

The alternative, “rotenone-insensitive” type II
NADH dehydrogenase (NDH-2) that is present in
many bacteria and eucarya was purified from the
membrane fraction of A. ambivalens (117). The gene
is present in genome sequences from the Sulfolobales,
Thermoplasmatales, and H. salinarum. The single-subunit
flavoproteins have the same enzymatic activity as
complex I but are unable to pump protons. Transmembrane
helices or extensive hydrophobic regions
are not predicted from the sequence of the archaeal
NDH-2, and it is speculated to be embedded at the
membrane surface by amphipathic helices rather than
being an integral membrane protein (16). Consistent
with its expected role in an electron transport chain,
the A. ambivalens type II NADH dehydrogenase and
the cytochrome aa3 terminal oxidase were reconstituted
with caldariella quinone in liposomes and
showed NADH-dependent oxygen consumption (117).

II. Succinate:quinone oxidoreductases (SQOs;
Fig. 28) are membrane-bound enzymes that catalyze
the oxidation of succinate to fumarate with quinone
as the electron acceptor, while the related, paralogous
fumarate reductases (FRDs) perform the reverse reaction
(224). All SQOs and FRDs have a flavin-containing,
large catalytic subunit (SqoA/FrdA) and a
smaller iron-sulfur protein (SqoB/FrdB). Most of the
enzymes contain one 4Fe-4S, one 3Fe-4S, and one 2Fe-2S
cluster in the SqoB subunit. SQOs fall into five
families (A to E) differing from each other in the
number, hydrophobicity, similarity, and heme content
of the remaining one or two subunits that serve as
membrane anchors and quinone-binding sites. For example,
in the mitochondrial family C SQRs, SqoC
and SqoD are transmembrane proteins with one bound
heme b.

Figure 29. Hypothetical scheme of the modular evolution of complex I from an ancestral hydrogenase. Bacterial, archaeal,
and cyanobacterial complexes emerged by acquisition of specific modules. Dark gray, hydrogenase module; light gray, transporter
module. Reproduced and modified from the Journal of Bioenergetics and Biomembranes (102) with permission of the
publisher.

Many archaeal genomes contain single or multi- 
ple copies of genes encoding SQO/FRD family pro- 
teins that have significant similarity to bacterial and 
eucaryal counterparts (349). The Sulfolobales are the 
only archaea to harbor family E enzymes, which are 
significantly different from standard SQOs. The four- 
subunit proteins were purified from the membrane fraction,
but the operons lack a gene encoding an obvious
transmembrane subunit. Instead, they contain hy- 
drophilic, putative anchor proteins with amphipathic 
helices (SqoE and F), suggesting that the complex 
“swims” in the membrane (236). A cysteine-rich, pu- 
tative FeS subunit (SqoE) that is related to het- 
erodisulfide reductases has paralogs in many micro- 
organisms (163). SqoE may be an electron transfer 
protein with multiple functions. Unlike SqoB
proteins in other families, SqoB in family E has two
4Fe clusters instead of one 4Fe and one 3Fe cluster (119,
165). Only SqoA is similar in all SQO families.

III. The bc1 complex (ubiquinol:cytochrome c
oxidoreductase) couples electron transfer to proton 
pumping by a Q-cycle mechanism (Fig. 28) (344). 
The bc1 complexes consist of several subunits, in- 
cluding an integral membrane protein with two b-type 
hemes, and two membrane-anchored proteins with 
one heme c1 moiety and a Rieske iron-sulfur cluster, 
respectively. The brightly colored Rieske proteins 
contain an unusual 2Fe-2S cluster, in which one iron 
ion is coordinated to the protein by two nitrogen lig- 
ands from histidine residues and the other by two sul- 
fur ligands from cysteine residues. 

Canonical bc1 complexes have not been found in 
Archaea, with the exception of a haloarchaeal ana- 
log that has been reported to be enriched but not yet 
purified (379). Analogous complexes containing mul- 
tiple hemes but lacking the c1 component were 
found, for example, in whole-cell EPR studies of S.
metallicus (118) and in other Sulfolobales. This was
inferred from findings that CbsAB from membranes 
of S. acidocaldarius is a highly glycosylated cyto- 
chrome b (145) and that two Rieske proteins were 
present, SoxF and SoxL (147, 358). It was subse- 
quently found that SoxL and CbsAB were encoded 
in the same pentacistronic operon (cbsAB-soxLN- 
odsN) containing two b-type cytochromes (CbsA and 
SoxN), one Rieske protein (SoxL), and two proteins 
of unknown function. Homologous genes are present
in Metallosphaera and A. ambivalens. The functions
of this novel complex and its electron acceptors are 
not known. A distinct S. acidocaldarius complex that 
is analogous to bc1 complexes is described below 
(complex IV). 

IV. Oxygen reductases (terminal oxidases) catalyze
the reduction of dioxygen to water with a variety
of electron donors that include quinols, reduced
small electron transfer proteins (e.g., cytochrome c), 
copper-containing azurin, and high-potential, iron- 
sulfur proteins (Fig. 28) (301, 344). The prototypes of 
oxygen reductases are the cytochrome aa3 complexes 
of the respiratory chain of mitochondria and bacteria. 
They are heme copper enzymes, characterized by the 
presence of a binuclear center at the active site con- 
sisting of a heme iron (a3) and a copper ion (CuB), 
which together bind and reduce dioxygen coupled to 
proton uptake from the cytoplasm (69). A second 
heme a present in most oxygen reductases is involved 
in electron transfer toward the active site, and a sec- 
ond, binuclear copper site (CuA) is present in the cy- 
tochrome c oxidases. Terminal oxidases pump pro- 
tons via a proton-conducting channel that is distinct 
from the proton uptake channel, resulting in the elec- 
trochemical gradient being generated by two sepa- 
rate mechanisms. 

Two different oxygen reductase complexes have 
been isolated from S. acidocaldarius. Both pump pro- 
tons, and both act as quinol oxidases. One is com- 
posed of four subunits encoded by genes in the sox- 
ABCD operon (112). Its composition is similar to 
that of bacterial quinol oxidases. The SoxB subunit 
contains the typical cytochrome aa3 diheme/copper
active site. SoxA is homologous to the copper A-containing
subunit of cytochrome c oxidases but lacks
the copper site, consistent with its function as a quinol
oxidase. SoxC is a diheme cytochrome a subunit with 
an unknown function. 

The second oxygen reductase (SOX-M) forms a 
larger complex and is encoded by the soxMHGFEI 
operon (211). SOX-M activity is blocked by cyanide 
and strongly inhibited by aurachin C and D deriva- 
tives, suggesting that the complex has two proton- 
pumping sites (211, 301). The complex is unusual be- 
cause it combines an oxygen reductase with a bc1 
analogue. SOX-M consists of two subcomplexes. The 
oxygen reductase subcomplex is a bb3-type terminal 
oxidase containing CuA and CuB. The oxygen-reduc- 
ing subunit SoxM contains CuB and one b and one 
b3-type heme. The CuA-containing SoxH subunit is 
similar to mitochondrial cytochrome c oxidases. The 
bc1-like subcomplex consists of a Rieske FeS-protein 
(SoxF) and a homolog of cytochrome b that hosts 
two a-type hemes (SoxG). The blue copper protein 
sulfocyanin (SoxE) functionally links the two subcomplexes
(211). With these properties, SOX-M differs
markedly from the SoxABCD complex (301). A
high-resolution 3D structure (1.1 Å) is available for
the soluble domain of the SoxF Rieske protein (35).

The proton-pumping, cytochrome aa3 quinol 
oxidase extracted from membranes of aerobically 
grown A. ambivalens has little similarity to the S. aci- 
docaldarius complexes (116). The enzyme was com- 
posed of five subunits encoded in two separate oper- 
ons (doxBCEF and doxDA) (314). DoxDA was later 
identified as the organism’s thiosulfate:quinone oxidoreductase
(see “H2S, sulfite, and thiosulfate oxidation,”
above) (271). The doxB gene encodes the aa3-
type heme-containing active subunit, while doxC 
encodes subunit II, which does not possess a CuA
center. 

Other respiratory chain components have been 
isolated from a variety of archaea, including b-type 
cytochromes from H. salinarum (132), two oxygen 
reductases from A. pernix (aa3 and ba types) (161), 
and cytochromes and another type II NADH oxidase 
from S. metallicus (17, 118). The respiratory chain 
of the haloalkaliphilic archaeon Natronobacterium 
pharaonis has also been studied (350). A small 
peripheral blue copper membrane protein, halo- 
cyanin, appears to have a similar function to plasto- 
cyanin. A small, two-subunit bc cytochrome that 
lacks a Rieske FeS center may be analogous to the bc1 
complex, and a large heterodimer carries two differ- 
ent ba3-type hemes and copper and may be the 
terminal oxidase. 

The archaeal respiratory chain complexes are 
diverse and differ from the canonical respiratory 
chains of mitochondria, P. denitrificans, and other 
complexes found in Bacteria. Thus archaeal com- 
plexes highlight the evolutionary diversity of respira- 
tory chains and, in particular, expand the knowledge 
of aerobic respiration in extreme habitats. 


Methanogenic Archaea 


Numerous members of the Euryarchaeota pro- 
duce methane as the major end product of their en- 
ergy metabolism (see Chapter 13). Methane and bio- 
logical methane production were discovered in the 
eighteenth century when the Italian physicist Alessan- 
dro Volta collected gas from anaerobic sediments of 
swamps and marshes and showed that it was flam- 
mable (combustible air) (253). Methane formation is 
the last step in the anoxic biodegradation of biologically
derived organic compounds (and often xenobiotics)
where electron acceptors other than CO2 are
limiting. Methane is important in the global carbon
cycle and is a greenhouse gas. Approximately 80 to
85% of the global annual production (600 to 1,200 Tg) 
of methane is biogenic (141, 319). Biogenic methane 
may be produced exclusively by methanogenic ar- 
chaea, although recently it was suggested that plants 
may be a source of methane (approximately 10 to 20%). Plant
methane is reported to be liberated by nonenzymatic 
and presently unknown reaction(s) (191). Bacteria 
also produce small amounts of methane from the de- 
methylation of methyl group carriers (e.g., S-adeno- 
sylmethionine) (325). 
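
As a rough check on the figures quoted above, the biogenic share of the global methane budget can be worked out directly. The following is a minimal sketch only: the function and parameter names are hypothetical, and it does nothing more than multiply the percentages by the annual totals given in the text.

```python
def biogenic_methane_range(total_min_tg, total_max_tg, share_min_pct, share_max_pct):
    """Return (low, high) bounds, in Tg per year, for biogenic methane.

    All four inputs are the values quoted in the text; nothing here is
    measured or modeled (hypothetical helper for illustration only).
    """
    low = total_min_tg * share_min_pct / 100
    high = total_max_tg * share_max_pct / 100
    return low, high


# Global annual production of 600 to 1,200 Tg, of which 80 to 85% is biogenic.
low, high = biogenic_methane_range(600, 1200, 80, 85)
print(f"Biogenic methane: about {low:.0f} to {high:.0f} Tg per year")
```

The width of the resulting range comes mostly from the uncertainty in the global total, not from the 80 to 85% share.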

Natural habitats for methanogens include anoxic 
freshwater swamps, ocean and lake sediments, hy- 
drothermal vents, animal digestive tracts (e.g., the 
rumen, termite hindgut, and humans), and within 
anaerobic protozoa. Important man-made habitats 
include rice paddies, landfills, and anaerobic sludge 
digesters of sewage treatment plants. The typical 
growth substrates of methanogens are H2 and CO2,
or short-chain (C4 to C5) organic compounds (re- 
viewed in references 34, 109, 141, 188). In some en- 
vironments, methanogens are out-competed for these 
substrates by bacteria (e.g., homoacetogenic and 
sulfate-reducing bacteria in the termite hindgut and 
marine sediments, respectively). Approximately two 
thirds of methane that is produced does not reach the 
atmosphere as it is reoxidized, either aerobically by 
methylotrophic bacteria or anaerobically by nonculturable
consortia of methylotrophic archaea and
sulfate-reducing bacteria (319, 365). 

The Black Sea benthos and marine sediments
near methane hydrate deposits are examples of
habitats where “methanotrophic methanogens” reside
(methanotrophic archaea closely related to methanogenic
archaea of the order Methanosarcinales)
(365). These archaea play an important role in
the global carbon cycle by performing anaerobic oxidation
of methane (AOM) to CO2, coupled to the reduction
of sulfate to H2S, and they are present in sufficient
abundance in the Black Sea benthos to be
studied in situ.
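
Assuming sulfate is the only electron acceptor and writing the products in the fully protonated forms named in the text, the net reaction of AOM can be balanced as shown below. This is a textbook-style summary added for orientation, not an equation given in this chapter.

```latex
\begin{equation*}
\mathrm{CH_4 + SO_4^{2-} + 2\,H^+ \;\rightarrow\; CO_2 + H_2S + 2\,H_2O}
\end{equation*}
```

Eight electrons are transferred per methane oxidized, which matches the eight electrons needed to reduce one sulfate to sulfide.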

Cultivated methanogens belong exclusively to 
the Euryarchaeota. They are phylogenetically very di- 
verse and are divided into four to five orders (Fig. 1) 
(see “Phylogeny of Archaea and the origin of life,” 
above). 


Physiology of methanogenesis 


Small carbon compounds are converted to
methane by pure cultures of methanogens. The most
abundant substrates are H2 + CO2, formate, and acetate.
Additional substrates include the C-1 compounds methanol,
mono-, di-, or trimethylamine, and dimethyl sulfide,
and some higher-order alcohols such as ethanol, 2-propanol,
2-butanol, and cyclopentanol. These substrates
are converted stoichiometrically to methane
and CO2. Methanogens do not tend to use typical
bacterial growth substrates, such as sugars, amino
acids, or complex nutrient mixtures (e.g., yeast extract).
However, some of these compounds can stimulate
growth, and many methanogens require more
complex nutrients or acetate for carbon assimilation
(34, 109, 141, 188). Cultures of methanogens and
anaerobic bacteria can degrade complex carbon sources
in a cooperative and symbiotic manner. Cocultivation
of H2-producing and sugar-utilizing clostridia with H2-consuming
methanogens keeps the partial pressure of
H2 low and facilitates growth of both types of microorganisms.
Substrate-level phosphorylation (typical
for fermentative bacteria) does not seem to occur in
methanogens. Chemolithoautotrophic growth on CO2
plus H2 is a feature of methanogens that is not restricted
by phylogeny. In contrast, acetoclastic methanogens
that disproportionate acetate to CH4 and CO2
are confined to the Methanosarcinales (188).

All methanogens have a blue-green fluorescence
under the microscope that results from a specific archaeal
flavin derivative, F420. Methanogens utilize
several highly specific coenzymes for carbon reduction,
which are otherwise found in only a few specialized
bacteria. Some cofactors are C1 carriers (methanofuran
and methanopterin), while others are redox
active: HS-HTP (also known as 7-mercaptoheptanoylthreonine
phosphate or coenzyme B), F420, nickel-containing
F430, and, in some species, methanophenazine.
All pathways of methanogenesis converge at
the cofactor coenzyme M (CoM). CoM and HS-HTP
form an energy-rich mixed disulfide. The reduction of
this disulfide is the energy-yielding step in methanogenesis
(65, 141).

H2 is a typical electron donor for CO2 reduction,
and electrons can also be derived from formate, CO,
or specific alcohols. Methanol and methylamines are
converted by a methylotrophic pathway to CH4 and
CO2 in a 3:1 ratio. The biochemistry of methanogenesis
has been extensively reviewed (65, 66, 141) (see
Chapter 13), and key steps are summarized below.


CO2 reduction: CO2 is activated by a molybdenum- or
tungsten-containing formylmethanofuran
dehydrogenase, which is an ion pump catalyzing
the reversible formylation of methanofuran coupled
to the reduction to the formyl state with electrons
derived from H2 via F420. A sodium motive
force drives this endergonic reaction.

Formyl transfers and reduction: The formyl
group is transferred to a methanopterin-containing
enzyme, dehydrated, and reduced to the
methyl level in two steps. The methyl group is
then transferred to CoM. This reaction is exergonic
and catalyzed by a Na+-pumping methyltransferase
enzyme complex, thus creating a
sodium motive force.

Methyl-CoM reduction: The F430-containing
methyl reductase accepts the methyl group from
CoM, forming a Ni-methyl intermediate. Methane
is liberated following reduction by electrons derived
from the formation of a mixed disulfide
consisting of coenzymes M and B. The exergonic
reaction is catalyzed by the soluble methylcoenzyme
M reductase.

Heterodisulfide reduction: The mixed disulfide is
the terminal electron acceptor. It is reduced to the
individual components by the heterodisulfide reductases,
which are central enzymes in methanogens.
The electron donor is H2.

Acetate cleavage: Acetate is activated by a kinase
in species such as Methanosarcina thermophila
and Methanosarcina barkeri when growing on
acetate. The key steps of the pathway are catalyzed
by the acetyl-CoA decarbonylase/synthase
(ACDS), a complex nickel enzyme that cleaves
the C-C and C-S bonds in acetyl-CoA and oxidizes
CO to CO2. The methyl group is transferred
to tetrahydrosarcinapterin.
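
For quick reference, the net reactions for the main substrates discussed in this section can be written out as follows. These are standard textbook stoichiometries, not equations taken from this chapter; the methanol reaction reproduces the 3:1 ratio of CH4 to CO2 noted in the text, and the last two lines sketch the coenzyme M and coenzyme B steps summarized above, where CoM-S-S-CoB denotes the mixed disulfide.

```latex
\begin{align*}
\mathrm{CO_2 + 4\,H_2} &\rightarrow \mathrm{CH_4 + 2\,H_2O} \\
\mathrm{4\,HCOOH} &\rightarrow \mathrm{CH_4 + 3\,CO_2 + 2\,H_2O} \\
\mathrm{4\,CH_3OH} &\rightarrow \mathrm{3\,CH_4 + CO_2 + 2\,H_2O} \\
\mathrm{CH_3COOH} &\rightarrow \mathrm{CH_4 + CO_2} \\[4pt]
\mathrm{CH_3\text{-}S\text{-}CoM + HS\text{-}CoB} &\rightarrow \mathrm{CH_4 + CoM\text{-}S\text{-}S\text{-}CoB} \\
\mathrm{CoM\text{-}S\text{-}S\text{-}CoB + H_2} &\rightarrow \mathrm{HS\text{-}CoM + HS\text{-}CoB}
\end{align*}
```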


Methanobacteriales and Methanopyrus 


The Methanobacteriales are predominantly rod- 
shaped and form chains or filaments. Most species 
stain gram positive because of pseudomurein in cell 
walls. Methanobacteriales isolates have been ob- 
tained from all kinds of strictly anaerobic habitats, in- 
cluding sewage digestors, hydrothermal vents and ter- 
restrial volcanic hot springs, freshwater sediments, 
rice paddies, and gastrointestinal tracts of animals. 
They tend to grow optimally at low salt concentrations
with a Topt of 15 to 97°C (34).

The order Methanobacteriales consists of two
families, the Methanobacteriaceae and Methanothermaceae.
The Methanobacteriaceae are divided into
four widely distributed genera, Methanobacterium,
Methanobrevibacter, Methanosphaera, and
Methanothermobacter. All grow by H2 oxidation,
with CO2 as terminal electron acceptor, although
some species can use formate, alcohols, and/or CO
as alternative electron donors (34). Methanobacterium
formicicum is frequently isolated from sewage
sludge and has been intensively studied.
Methanobrevibacter species are typical inhabitants
of ruminants, termites, and other animals, including
humans (240). Methanobrevibacter oralis has been
isolated from human subgingival plaque (237), while
various Methanobrevibacter smithii strains appear to
be the most abundant archaeal species in the mouth
and in the intestines (73, 237). These two species of
Methanobrevibacter have been implicated as possible
causative agents of periodontitis in humans, although
other research groups have not been able to
confirm the results (237, 370, 415). Methanosphaera
species differ from other members of this family by
reducing methanol instead of CO2 as a terminal electron
acceptor. The genome sequence of Methanosphaera
stadtmanae was recently released (GenBank:
NC_007681) (34).

The genus Methanothermobacter was recently
proposed, with members being derived from the genus
Methanobacterium (426). Methanothermobacter thermautotrophicus
and M. marburgensis were previously
described as Methanobacterium thermautotrophicum
strain Delta-H and strain Marburg, respectively. The
genome sequence of Methanothermobacter thermautotrophicus
was one of the first archaeal genomes to
be published (373). Methanothermobacter species are
H2/CO2-utilizing thermophiles with a Topt of 55 to
65°C (34). Some strains can also use formate to reduce
CO2. Most strains were isolated from anaerobic
sludge digestors, despite the fact that the Topt of all
species exceeds the typical digester temperature
(37°C). Relatives of Methanothermobacter are found
worldwide in hot springs.

In contrast to the Methanobacteriaceae, the 
Methanothermaceae are represented by only one 
genus, Methanothermus. The two species of this 
genus, M. fervidus and M. sociabilis, were isolated 
from near-neutral Icelandic solfataras and grow by 
H2 oxidation coupled to CO2 reduction with a Tmax
of 97°C (34). 

Methanopyrus kandleri was isolated from a deep 
hydrothermal vent in the Gulf of California and has a 
similar physiology to members of the Methanothermaceae
(47). It grows at higher temperatures
than any other methanogen and has a
growth temperature range of 80 to 110°C utilizing
H2/CO2. M. kandleri has the highest intracellular concentrations
of cyclic 2,3-bisphosphoglycerate (1.1 M),
and this may help to stabilize proteins against ther- 
mal denaturation (366). Consistent with the high con- 
centration of this intracellular solute, most Methano- 
pyrus proteins exhibit optimal activity at high-salt 
concentrations (>1 M). M. kandleri has an unusual 
membrane composition consisting of terpenoid lipids. 
The phylogeny of Methanopyrus has been debated, 
and it appears either to be a member of a distinct or- 
der, the Methanopyrales, or a fast-evolved Methano- 
bacteriales species (see “Phylogeny of Archaea and the 
origin of life,” above) (see Chapter 19). 


Methanococcales 


The Methanococcales form a phylogenetically and 
physiologically coherent order of methanogens. For 
example, the similarity of the 16S rDNAs of the 
mesophilic Methanococcus maripaludis and the hyper- 
thermophilic Methanocaldococcus jannaschii is 88% 
(430). All members are fast-growing (0.5 to 2 h dou- 
bling time), irregular cocci that require salt. They all 
utilize CO2/H2, while only a few species are able to use
formate (but no other substrate) as electron donor. 
Similar to M. maripaludis, some Methanococcales fix 
atmospheric nitrogen. The cell wall is composed of a 
proteinaceous S layer, and most strains are motile. The 
greatest phenotypic diversity in this order relates to 
their Topt.

The Methanococcales form two families with 
two genera in each (430). The Methanocaldococ- 
caceae comprise the hyperthermophilic genera 
Methanocaldococcus and Methanotorris, while the 
mesophiles and moderate thermophiles are in the 
family Methanococcaceae. The genus Methanococ- 
cus includes mesophilic species, with thermophilic 
species in the genus Methanothermococcus. All isolates
of Methanococcales have come from marine
environments, such as Methanococcus vannielii from
San Francisco Bay and M. voltae from estuarine
sediments in Florida. The moderately thermophilic
Methanothermococcaceae were isolated from geothermally
heated coastal sediments and from North
Sea oil fields. The hyperthermophilic species within
the genera Methanocaldococcus and Methanotorris 
are widespread in submarine hydrothermal systems. 

Methanocaldococcus jannaschii, previously named 
Methanococcus jannaschii, has the distinction of be- 
ing the first hyperthermophile to be isolated from 
deep-sea vents, and the first cultivated hyperther- 
mophilic methanogen (174). It was also the first ar- 
chaeon, and only the second microorganism, to have 
its complete genome sequenced (46). As a result, it 
has become a model organism, in particular, for struc- 
tural biology. Numerous crystal structures have been 
determined for proteins with known and unknown 
functions, including the cell division protein FtsZ, the 
FLAP and splicing endonucleases, topoisomerases, 
translation initiation factors, DNA-binding proteins, 
ribosomal proteins, and many more (http://www.ncbi.nlm.nih.gov/Structure/).
M. jannaschii was isolated from deep-sea vents with
in situ hydrostatic pressure of 20 to 30 MPa and, as
a result, has served as a
model organism for studying the effects of pressure on 
growth, metabolism, and enzyme activities. Several en- 
zymes (e.g., hydrogenases) have significantly increased 
half-lives at elevated pressure (97). Pressure has also 
been shown to exert a lipid-ordering effect (177). 


Genetic systems have been developed for Metha- 
nococcus voltae and M. maripaludis, including meth- 
ods for transformation, shuttle and expression vec- 
tors, antibiotic resistance markers, and reporter genes 
(reviewed in reference 407) (see Chapter 21). A phage 
transduction system has been described for M. voltae 
(81). The genome sequence is available from M. mari- 
paludis (143). Mechanisms of gene regulation of 
motility, nitrogen fixation, and hydrogenases have 
been extensively studied in M. voltae (430). Four hy- 
drogenase operons are present; two encode selenium- 
dependent enzymes containing selenocysteine residues 
at the active site NiFe cluster, and the others encode 
selenium-free enzymes with the canonical set of four 
cysteine residues coordinating a Ni ion. Both pairs include
coenzyme F420-reducing and F420-non-reducing
enzymes. The selenium-free hydrogenases are encoded 
by two operons divergently transcribed from a 453- 
bp intergenic region. Expression of these genes was observed
only when selenium was absent from the growth
medium and seems to be regulated negatively
from a silencer region and positively from a binding site for
a transcriptional activator (281).


Methanomicrobiales 


Methanomicrobiales have a diverse morphology
that includes rods, irregular cocci, and disc-shaped
cells (109). They are also distinguished from other
methanogens by their cell wall and lipid composition.
All utilize H2/CO2, and some species may use alcohols
and/or formate. Methanomicrobiales are generally
distinguished from the Methanobacteriales by
their proteinaceous S-layer cell wall, and from the
Methanosarcinales by their inability to use acetate or
organic C1 compounds (other than formate) as substrates
for methanogenesis. However, some species require
acetate as an organic carbon source for assimilation.
At least twenty-four species grouped into nine
genera and three families have been described, highlighting
the cultivatable diversity of this taxon. The
Methanomicrobiales have not been as well studied as
other methanogens. For example, only a single 3D
structure is available (the luciferase-like, F420-dependent,
secondary alcohol dehydrogenase from Methanoculleus
thermophilicus) (12). Genome sequence
data are available for the psychrophile Methanogenium
frigidum (348) and, more recently, for Methanospirillum
hungatei (GenBank: NC_007796).

Methanomicrobiales species have been isolated 
from sewage sludge digestors, marine and freshwater 
sediments, oil field reservoirs, and the rumen (109). 
Molecular ecology studies have identified Metha- 
nomicrobium mobile as a predominant methanogen in 
a wood-fermenting bioreactor, representing ~90% of 
the methanogen population (252). This predominance
has not been observed in other types of digestors,
which may reflect its requirement for specific wood 
substrates. Methanomicrobiales and Methanosarci- 
nales have been found to represent the majority of 
methanogens in a study of 21 different digestors 
(318). Methanomicrobiales are the second most abun- 
dant methanogens typically found in ruminants, 
Methanosarcinales being the most abundant (e.g., 
397). In the bovine rumen, Methanomicrobium mo- 
bile and Methanobrevibacter species were found in 
abundance (167). The psychrophile Methanogenium
frigidum was isolated from the cold hypolimnion of a 
seawater-derived Antarctic lake and grows between 1 
and 17°C (Table 10) (51, 98). As a psychrophile, M. 
frigidum expands the temperature range for cultivated 
methanogens (98). In contrast, the phylogenetically re- 
lated thermophile, Methanoculleus thermophilicum, 
was isolated from sediments of high-temperature ef- 
fluent channels of a nuclear power plant (326). 

A few species of Methanomicrobiales are endo- 
symbionts of anaerobic protozoa that are associated 
with the hosts’ hydrogenosome or cytoplasm. Hy- 
drogenosome association suggests a hydrogen trans- 
fer to the methanogen (27, 84, 85). For example, the 
anaerobic protozoa Metopus, Trimyema, and 
Pelomyxa harbor endosymbionts related to Methano- 
corpusculum and Methanoplanus that are phyloge- 
netically different from their free-living relatives (84). 

Cells of Methanospirillum hungatei are rod 
shaped and show a unique ultrastructure when viewed 
with the electron microscope. Similar to other Methanomicrobiales,
they contain an S layer (109). However,
individual M. hungatei cells often form chains
that are further enclosed by a proteinaceous paracrys- 
talline sheath. The M. hungatei sheath is resistant to 
denaturants, salts, or proteases, although it can be 
dissociated by reducing agents, suggesting that disul- 
fide bridges may function to stabilize the structure. 
Twenty percent of the sheath’s total mass consists of 
phenol-soluble proteins. A regular array of 2.8-nm- 
wide particles form the sheath (reviewed in reference 
109) (see Chapter 14). The sheath can withstand 
pressures of 30 to 40 MPa (measured by atomic force 
microscopy). It has been speculated that the sheath 
may function as a pressure regulator, opening in response
to high intracellular CH4 pressures and allowing
a gas exchange with H2 and CO2 (438). Within
the sheath, highly permeable spacer plugs separate 
the cells at their poles, indicating that they might play 
a role in cell division (109). 


Methanosarcinales 


Methanosarcinales species are widespread in all
types of anaerobic environments. Cells are either
sheathed rods or coccoid and often in groups or clusters.
They usually have a proteinaceous cell wall,
while some are additionally surrounded by a sheath
or acidic heteropolysaccharide; pseudomurein is not
present (reviewed in reference 188). All acetoclastic
methanogens are species of Methanosarcinales (65).
This order represents the methanogens that are capable
of utilizing the widest range of substrates (CO₂/H₂,
disproportionation of C₁ compounds or methylated
amines, chloroform, methyl sulfides, and acetate),
and some individual strains can use most of
these substrates (188).

Table 10. Characteristics of selected species of methanogens^a

Order and species | Morphology | Source/habitat | Temperature optimum (°C) | pH optimum | Substrates for methanogenesis | No. of species in genus | Remarks^b
Methanobacteriales and Methanopyrus
Methanobacterium formicicum | Long rods | Anaerobic sludge digestors | 30-45 | 7-7.5 | H₂/CO₂; formate | 13 |
Methanobacterium bryantii | Long rods | Coculture with Methanobacillus omelianski | 30-45 | 6.5-7.5 | H₂/CO₂ | |
Methanobrevibacter ruminantium | Short rods | Bovine rumen | 37-39 | 6.3-6.8 | H₂/CO₂; formate | 12 |
Methanobrevibacter smithii | Coccoid, chains | Intestines of many animals, including humans | 37 | 6.9-7.4 | H₂/CO₂; formate | |
Methanobrevibacter oralis | Short rods | Bovine rumen | 35-38 | 6.3-6.8 | H₂/CO₂ | |
Methanosphaera stadtmanae | Coccoid | Human feces | 30-40 | 6.5-6.9 | Methanol and H₂ | | G
Methanothermobacter marburgensis | Long rods | Anaerobic sludge digestors | 65 | 7.2-7.6 | H₂/CO₂ | 6 |
Methanothermobacter thermautotrophicus | Long rods | Anaerobic sludge digestors | 65-70 | 6.7-7 | H₂/CO₂; some strains on formate | | G
Methanothermobacter wolfei | Rods, sometimes coccoid | Anaerobic sludge digestors | n.r.^c | n.r. | H₂/CO₂; formate | | Growth stimulated by tungstate
Methanothermus fervidus | Rods | Icelandic hot spring | 80-85 | 6.5 | H₂/CO₂ | 2 | Presence of a periplasm
Methanopyrus kandleri | Rods | Abyssal hot vents, Guaymas, and Kolbeinsey Ridge | 95-98 | =6.5 | H₂/CO₂ | 1 | Temperature range 80-110°C, requires salt
Methanococcales
Methanococcus vannielii | Cocci | Marine sediments | 35-40 | 6.5 | H₂/CO₂; formate | 4 | Motile, halophilic (≤5% salt)
Methanococcus voltae | Cocci | Marine sediments | 35-40 | 6.5 | H₂/CO₂; formate | | Motile, halophilic (≤5% salt)
Methanococcus maripaludis | Cocci | Marine sediments | 35-40 | 6.5 | H₂/CO₂; formate | | Motile, halophilic (≤5% salt), N₂ fixation
Methanothermococcus thermolithotrophicus | Cocci | Submarine vents | 60-70 | 6.5-7.5 | H₂/CO₂; formate | 2 | Motile, halophilic (≤9% salt), N₂ fixation (some)
Methanocaldococcus jannaschii | Cocci | White smoker, East Pacific Rise | 80-85 | 6-6.5 | H₂/CO₂ | 5 | G; motile, halophilic (≤5% salt)
Methanotorris igneus | Cocci | Kolbeinsey Ridge, Iceland | 85-88 | 5.5-6 | H₂/CO₂ | 2 | Nonmotile, halophilic (≤7% salt)
Methanomicrobiales
Methanocorpusculum parvum | Irregular cocci | Sour whey digestor | 35-40 | 6.7-7.5 | H₂/CO₂; formate; 2-propanol + CO₂ | 5 | Motile, requires tungstate
Methanoculleus thermophilus | Irregular cocci | Sludge digestors, marine sediments | 55-60 | 6.7-7.2 | H₂/CO₂; formate | 6 | Motile
Methanofollis liminatans | Irregular cocci | Industrial wastewater digestor | 40 | 7 | H₂/CO₂; formate; CO₂ + 2-propanol or 2-butanol or cyclopentanol | 5 | Motile, tungstate stimulatory
Methanogenium frigidum | Irregular cocci | Ace Lake, Antarctica | 15 | 7.5-7.9 | H₂/CO₂; formate | 3 | G*, temperature 0-17°C, slightly halophilic
Methanolacinia paynteri | Irregular cocci | Marine sediments | 40 | 7 | H₂/CO₂; CO₂ + 2-propanol, 2-butanol, cyclopentanol | 1 | Highly irregular morphology
Methanomicrobium mobile | Slightly curved, short rods | Rumen fluid | 40 | 6.1-6.7 | H₂/CO₂; formate | 1 |
Methanoplanus endosymbiosus | Slightly curved, short rods | Marine ciliate Metopus contortus | 32 | 6.8-7.3 | H₂/CO₂; formate | 3 | Endosymbiont
Methanospirillum hungatei | Long curved rods | Various | 30-37 | 6.6-7.4 | H₂/CO₂; formate; CO₂ + 2-propanol, 2-butanol | 1 | G, flagella and sheaths
Methanocalculus halotolerans | Coccoid | Oil-producing well | 38 | 7.6 | H₂/CO₂; formate | 4 | Halophile, salt concentration 0-12%
Methanosarcinales
Methanosaeta concilii | Straight rods | Anaerobic sludge digestors | 35-40 | 7.1-7.4 | Acetate | 2 |
Methanomicrococcus blatticola | Flat polygonal spheroids | Cockroach hindgut | 39 | 7.2-7.7 | H₂ + methanol or methylamines | 1 |
Methanothrix thermophila | Coccoid, aggregates | Khlorid Lake, Kamchatka, Russia | 50 | 7.5 | Acetate, methanol, methylamines | 1 |
Methanococcoides burtonii | Irregular cocci, aggregates | Ace Lake, Antarctica | 23 | 7.7 | Methanol, methylamines | 3 | G*, motile
Methanohalobium evestigatum | Flat polygonal spheroids, aggregates | Salt lagoons, Sivash, Ukraine | 50 | 7-7.5 | Methylamines | 1 | Extremely halophilic: optimal NaCl concentration 4.2 M
Methanohalophilus mahii | Irregular cocci, aggregates | Great Salt Lake, Utah, USA | 35 | 7.5 | Methanol, methylamines | 4 | Halophilic: optimal NaCl concentration 2 M
Methanolobus oregonensis | Irregular cocci, aggregates | Alkaline, saline aquifer from Oregon, USA | 35 | 8.6 | Methanol, methyl sulfides, methylamines | 5 |
Methanosalsum zhilinae | Irregular cocci, aggregates | Wadi el Natrun, Egypt | 45 | 9.2 | Methanol, methyl sulfides, methylamines | 1 | Motile, moderately halophilic: optimal NaCl concentration 0.7 M
Methanosarcina acetivorans | Cocci, aggregates | Marine sediments | 35-40 | 6.5-7 | Acetate; methanol, methylamines | 9 |
Methanosarcina barkeri | Cocci, aggregates | Anaerobic sludge digestors | 45 | 7.0 | H₂/CO₂; acetate; methylamines, methanol | | G, gas vesicles
Methanosarcina mazei | Cocci, aggregates | Anaerobic sludge digestors | 40-42 | 6.8-7.2 | H₂/CO₂; acetate; methylamines, methanol | | G
Methanosarcina vacuolata | Cocci, aggregates | Anaerobic sludge digestors | 40 | 7.5 | Acetate | | Gas vesicles
Methanomethylovorans hollandica | Cocci, aggregates | Eutrophic lake, The Netherlands | 37 | 6.5-7.0 | Methanol, dimethylsulfide, methylamines, methane thiol | | Gas vesicles

^a Data compiled from several sources (34, 109, 188, 430), including the taxonomy browser (http://www.ncbi.nih.gov) and the German culture collection (http://www.dsmz.de). All currently recognized genera are represented but not all species.
^b G, genome sequence available; G*, draft genome sequence.
^c n.r., not reported.

The two Methanosarcinales families, the Metha- 
nosarcinaceae and the Methanosaetaceae, are distin- 
guished by morphological and physiological proper- 
ties. The Methanosarcinaceae have a G+C content 
of 35 to 46% and form coccoid or pseudosarcinal 
cells; many are motile and can use a wide range of 
substrates. The Methanosarcinaceae include halo- 
philic genera isolated from hypersaline environments. 
Methanosaeta concilii and Methanosaeta thermo- 
phila, the only two species of the Methanosaetaceae, 
are nonmotile sheathed rods. They use only acetate 
for methanogenesis. Their G+C content is 49 to 
54%. Methanosaeta strains often outcompete Methanosarcina
spp. in environments where acetate concentrations
are low because their transport systems
have a low Km for the substrate. In contrast,
Methanosarcina strains often predominate at low pH
values and high acetate concentrations (188). The
nine cultivated Methanosarcina species are slightly
halotolerant and/or slightly halophilic and are nonmotile.
They metabolize acetate, methylamines, methanol,
and CO. Some strains also grow on CO₂/H₂
(188), and several can dechlorinate chloroform and
other highly chlorinated small alkanes (9, 49, 149).
Some species, such as Msc. vacuolatum and a number 
of Methanosarcina mazei strains, form gas vesicles 
(36). Gas vesicle synthesis genes are present in the 
genome sequence of Methanosarcina barkeri (Fig. 14) 
but not the M. mazei strains. At least four species 
were isolated from anaerobic digestors, where they 
are particularly common (188). Knowledge of the 
metabolic diversity of the Methanosarcinales was re- 
cently improved, and Methanosarcina acetivorans 
strain C2A was found to form acetate and formate, 
rather than methane, as the major metabolic end 
products for energy conservation when growing on 
CO (333). Consistent with an inhibitory effect of CO, 
methane production was found to decrease with in- 
creasing CO concentration. 
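
The competition for acetate described above can be illustrated with a minimal Michaelis-Menten sketch. The kinetic constants below are hypothetical placeholders chosen only to show how a low-Km organism can win at low substrate concentrations while a high-capacity organism wins at high concentrations; they are not measured values for either genus.

```python
def uptake_rate(s, v_max, k_m):
    """Michaelis-Menten rate at substrate concentration s (same units as k_m)."""
    return v_max * s / (k_m + s)


# Hypothetical parameters (illustrative only, not taken from the source):
# a high-affinity, low-capacity organism and a low-affinity, high-capacity one.
HIGH_AFFINITY = {"v_max": 1.0, "k_m": 0.1}  # Methanosaeta-like
LOW_AFFINITY = {"v_max": 5.0, "k_m": 5.0}   # Methanosarcina-like


def faster_at(s):
    """Name the parameter set with the higher uptake rate at concentration s."""
    a = uptake_rate(s, **HIGH_AFFINITY)
    b = uptake_rate(s, **LOW_AFFINITY)
    return "high-affinity" if a > b else "low-affinity"


for acetate in (0.05, 0.5, 5.0, 50.0):
    print(acetate, faster_at(acetate))
```

With these assumed numbers the two curves cross between 0.5 and 5 concentration units; below the crossover the high-affinity parameter set takes up substrate faster, above it the low-affinity one does.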

Methanosarcina barkeri, M. mazei, and M. acetivorans
are the best-studied methanogens and
are important model organisms for studies on acetoclastic
methanogenesis, transcription, and chaperonins
(65, 146, 251). Genome sequences are available
for M. barkeri (GenBank: NC_007355), M. acetivorans
C2A (GenBank: NC_003552), and M. mazei
Go-1 (GenBank: NC_003901).

Other genera of the Methanosarcinaceae are
moderately halophilic, growing on organic C₁ compounds
at the salinity of seawater (188). Methanolobus
and Methanococcoides species were isolated from sea 
sediments or from terrestrial salt lakes. Similar to 
Methanogenium frigidum, Methanococcoides burtonii 
was isolated from the anoxic hypolimnion of Ace Lake, 
Antarctica, and is a psychrophile. The genome sequence 
was recently completed, and the organism serves as a 
model for cold adaptation (51, 121, 348). In contrast, 
Methanohalophilus and Methanosalsum species have 
been isolated from hypersaline lakes or soda lakes, respectively
(188). They grow at optimal salt concentrations
of 0.7 to 1.5 M (Table 10). Methanosalsum
zhilinae and Methanolobus oregonensis are the only
Methanosarcinales isolates with an alkaline pH optimum
(9.2 and 8.6, respectively), and other members of
this family are neutrophiles. In contrast, Methanohalobium
evestigatum is a moderately thermophilic halophile
with an optimal salt concentration of 4.3 M and a
Topt of 50°C that is found in hypersaline, NaCl-saturated
environments. M. evestigatum grows on methylamines,
while other, more moderately halophilic,
methanogens (e.g., Methanococcoides burtonii) can
also use methanol.


(Anaerobic) methane oxidation 
by methanotrophic archaea 


Numerous aerobic bacteria oxidize methane using
a methane monooxygenase for the primary activation of
the substrate. Alternatively, several cultivated methylotrophic
bacteria, such as Methylococcus capsulatus,
have a reversed methanogenesis pathway for methane
oxidation (55). A third means of methane oxidation
is the anaerobic oxidation of methane (AOM), which is
carried out by uncultivated methanotrophic archaea
that are closely related to the Methanosarcinales (reviewed in reference
365). Several phylogenetically distinct lineages that 
are associated with AOM have been identified (Fig. 1). 
They appear in situ to be in a syntrophic association
with sulfate-reducing δ-proteobacteria. The association
is required to support metabolism by an extracellular
electron transfer and to keep the free-energy
change below the limits of −10 kJ mol⁻¹ for methanogenic
archaea and −19 kJ mol⁻¹ for sulfate-reducing
bacteria.

The AOM pathway involves enzymes from
methanogenesis, including a homolog of the methyl-CoM
reductase (365). The Black Sea microbial mats
are composed of >50% archaea from the ANME-1
cluster. In contrast to the open ocean, the water
body of the Black Sea is anoxic at shallow depth. Large
amounts of methane are liberated in the sediments, 
which allows the development of large, slow-growing 
pillars that consist of microbial mats of AOM ar- 
chaea and sulfate-reducing proteobacteria (262, 276). 
In the mats, a homolog of methyl-CoM reductase rep- 
resents up to 7% of total protein (218). The reductase 
purified directly from mat samples primarily catalyzed 
the reverse reaction (oxidation of methane to methyl-CoM).
The enzyme had biochemical properties comparable
to those of the canonical methanogen enzymes, with
the exception that the Ni-containing F430 cofactor of the
enzyme contained an unknown chemical modification
that increased its molecular mass from 905 Da to 951
Da. The Black Sea mats performed anaerobic, sulfate-dependent
AOM in the laboratory, and methanogenesis
with different substrates did not occur. These results
are strong evidence that the mats carry out
AOM rather than methanogenesis (218, 365).

Very recently, AOM was found to occur in anaer- 
obic freshwater ecosystems within methane-producing 
sediments (316). A consortium consisting of methane- 
oxidizing archaea and novel bacteria was retrieved 
from a canal in the Netherlands and was cultivated 
successfully in the laboratory by coupling AOM to 
denitrification. The bacteria belong to a phylogenetic
branch that had not been isolated before and was
known solely from molecular ecological studies. These
results suggest that AOM is much more widespread 
than anticipated and that it could be coupled to a va- 
riety of anaerobic respiration processes. 


PERSPECTIVE: THE NEXT FIVE YEARS 


Research on Archaea has generated impressive 
results in the almost 30 years that have passed since 
their recognition as a fundamentally different domain 
of life. Research has focused on the discovery and de- 
scriptions of novel organisms, the search for Archaea 
in habitats other than hot springs, salt lakes, and 
anaerobic sediments, the determination of their fun- 
damental cellular processes, and the determination 
of their major physiological pathways. An increasing 
number of Archaea have become amenable to genetic 
techniques, although a great scope exists for this to 
expand. Commercial applications have also surfaced, 
the most obvious being the proofreading DNA poly- 
merases used for PCR. 

Areas where significant progress may be ex- 
pected in the next five years include: 


Isolation. It is likely that an increasing number of
nonextremophilic and/or nonmethanogenic isolates
will be obtained. The novel strains will expand
understanding of the range of archaeal
metabolic diversity and biochemical targets for 
enzymological studies. The isolates should begin 
to replace the uncultivated species that have been 
identified from molecular ecology, and particu- 
larly large-scale metagenomic studies, and add 
culturable Archaea to the “orphan branches” of 
the phylogenetic tree (Fig. 1). 

Consortia. In addition to attempting to isolate 
individual species of Archaea, focus is required 
on studying symbiotic and other mutualistic con- 
sortia of microorganisms that can only be grown 
in coculture or are not amenable to any form of 
laboratory-based cultivation. The Iron Mountain 
acid mine ecosystem (408) and the consortia of 
denitrifying anaerobic methane oxidizers (316) 
are examples of recently identified consortia that 
contain novel groups of Archaea. Realization of 
the importance of these systems is likely to stim- 
ulate future research of other types of archaeal 
consortia. 

Metagenomics. Metagenomic approaches have 
been responsible for dramatically expanding 
knowledge of microbial diversity and producing 
the highly diverged phylogenetic tree that, in par- 
ticular, highlights the number of uncultivated 
species (Fig. 1) (353). The future will add an in- 
creasing number of metagenomes, which should 
also aid in the isolation of culturable strains of 
Archaea and Bacteria. Metagenomics is likely to 
enable complete archaeal genomes to be con- 
structed without having isolated, or even recog- 
nized, the existence of the microorganism in the 
environment. Fluorescence staining, stable iso- 
tope probing, and optical tweezers techniques 
could make it possible to identify and isolate 
previously unknown species from environmen- 
tal samples. One consequence of these studies 
will be the generation of an expanded database 
of commercially and scientifically valuable enzymes
that can be identified using increasingly
powerful bioinformatic tools.

Structural biology. The availability of archaeal
genome sequences has triggered a large increase 
in protein structural data (see Chapter 20). Indi- 
vidual research groups and coordinated struc- 
tural genomics programs have both played im- 
portant roles. The structural genomics programs 
have passed the inevitable early-lag phase and 
are presently generating large volumes of data, 
and this is set to increase significantly in the near 
future. In particular, hyper/thermophilic archaea 
(and bacteria) are well suited as sources of proteins
for structural studies. With improvements
in high-throughput robots for crystallization using
extremely small volumes of proteins, archaeal
targets will be able to be screened at increasingly
higher speeds. The result of this
archaeal research will be to significantly improve 
the fundamental understanding of cellular 
processes not only in the Archaea, but also in all 
other forms of life. 

Modeling. One of the current trends in systems 
biology is the attempt to model and predict 
metabolic fluxes within microorganisms and in 
ecosystems, signal transduction networks in large 
regulons and whole cells, and in vivo protein- 
protein interactions. These types of approaches 
will be enormously helpful in devising hypothe- 
ses about all aspects of cell biology and ecology, 
thereby enabling predictions to be tested experi- 
mentally. Archaea have been and will continue to 
be useful models for a broad variety of systems 
biology studies. 

Cell biology. Central processes of replication, 
transcription, and translation have been eluci- 
dated to a greater or lesser degree in all organ- 
isms. However, important details of these pro- 
cesses remain to be determined (see Chapters 3, 
6, and 8), and a great deal remains to be discov- 
ered about other cellular processes. It is likely 
that the speed in which knowledge is gained will 
continue to increase. The Archaea will remain 
model organisms for studying the more compli- 
cated eucaryal cellular systems. Molecular inter- 
action studies will help to resolve the protein-pro- 
tein interaction networks that occur throughout 
the archaeal cell cycle, and cryotomography will 
be a valuable tool for reconstructing high-resolu- 
tion images of the living cell. 

Physiology. As physiological studies of the Ar- 
chaea expand to the same degree as has occurred 
with Bacteria, the extent of metabolic capacity 
and the pathways of metabolism will begin to be 
realized (see Chapter 12), and will illustrate the 
diversity of physiological types that exist within 
the archaeal world.


Chapter 3


DNA Replication and Cell Cycle 


SI-HOUY LAO-SIRIEIX, VICTORIA L. MARSH, AND STEPHEN D. BELL 


INTRODUCTION 


Replication of DNA is of fundamental impor- 
tance to all living organisms. Inappropriate or un- 
timely replication can have severe consequences at the 
cellular and organismal level. Replication must also 
be highly accurate to ensure faithful propagation of 
the genetic information, yet errors must be tolerated 
to permit the generation of diversity upon which evo- 
lutionary selective processes can act. This chapter de- 
scribes the recent advances that have been made in 
understanding the biochemical players that facilitate 
the complex macromolecular process that mediates 
faithful replication of archaeal chromosomes. Fur- 
thermore, it is clear that once DNA replication is ini- 
tiated it must progress to completion. Thus, cells have 
dedicated periods within the cell cycle to devote en- 
ergy and metabolites toward the highly costly task of 
replicating the genome. The current state of knowl- 
edge of the machineries that drive the archaeal cell cy- 
cle is discussed. 

Conceptually and mechanistically, DNA repli- 
cation can be split into several stages. First, a site at 
which replication initiates, an origin of replication, 
must be defined at the molecular level (Fig. 1A and 
B). This leads to the recruitment of a DNA helicase, 
an enzyme that utilizes energy in the form of ATP, to 
unwind the double helix of DNA at the origin, and to 
expose the single-stranded DNA template for synthe- 
sis of new DNA (Fig. 1C). The single-stranded DNA 
is stabilized by specialized single-strand-binding pro- 
teins, and DNA synthesis is initiated by the genera- 
tion of a short oligoribonucleotide primer (Fig. 1D). 
This primer is then extended by cellular DNA poly- 
merases (Fig. 1E). An additional complexity arises at 
this point due to innate chemical asymmetry of DNA. 
All known primases and DNA polymerases can only
synthesize DNA in a 5’ to 3’ polarity. To overcome
this potential problem, as the two strands of DNA 
are copied, one strand, the leading strand, is synthe- 
sized continuously, while the other strand is synthe- 
sized in short segments (Okazaki fragments), which 
require processing and joining to form covalently in- 
tact DNA molecules (Fig. 1F and G). Thus, there are 
very different requirements imposed on leading- and 
lagging-strand DNA polymerases. More specifically, 
the polymerase on the leading strand must synthesize 
DNA highly processively, remaining attached to the 
template for potentially many megabases. In contrast, 
the lagging-strand DNA polymerase must constantly
undergo a repetitive cycle of recruitment, synthesis
of a short DNA molecule, and release from the
template. These highly differing properties are imposed,
at least in part, by the regulated association
of DNA polymerase with an accessory factor, the 
sliding clamp. The sliding clamp also serves as a cen- 
tral nexus for the coordination of the proteins in- 
volved in processing and joining Okazaki fragments 
(Fig. 1G). 

Table 1 summarizes the proteins that catalyze 
this complex cascade of events. It is readily apparent 
from this list that the machineries employed by the 
Archaea and Eucarya to perform these multiple tasks 
are closely related. Furthermore, although analogous 
activities are found in bacterial cells, the proteins that 
catalyze the events are nonorthologous with their 
archaeo-eucaryal counterparts. The relationship be- 
tween the archaeal and eucaryal proteins, coupled 
with the fact that the archaeal machinery is generally 
perceived as a simplified version of the eucaryal ma- 
chinery, has generated considerable interest in the 
Archaea as an experimentally tractable model for the 
fundamentally related yet massively more complex 
eucaryal replication machinery (5, 32, 59).

Figure 1. Cartoon of the steps involved in DNA replication. Panels A to E describe the assembly of replication fork components
at an origin of replication. SSB has been omitted for visual clarity. Panels F and G show detail at a single replication fork. 


ORIGINS OF ORIGINS 


Despite the similarity between the DNA replica- 
tion proteins in Archaea and Eucarya, there is a fun- 
damental difference in the way that genomic DNA is 
organized in these two domains of life. In eucarya, 
chromosomes contain linear DNA molecules; in 
contrast, archaea and most bacteria have circular 
genomes. In addition, the DNA content of eucaryal 
cells can be three to four orders of magnitude greater 
than that in bacteria and archaea. 

This difference in organization and abundance of 
DNA between Eucarya and Bacteria is mirrored by 
the fundamentally different way in which they orga- 
nize DNA replication. Clearly, eucaryal cells do not 
take 10,000 times as long as bacterial cells to replicate
their DNA. Rather, eucarya exploit a strategy of
utilizing multiple initiation sites along their chromosomes
that are typically 10 to 300 kb apart (4, 96).
The way in which these origins are defined and regu- 
lated is beyond the scope of this chapter but has been 
reviewed elsewhere recently (82, 96). In contrast, bac- 
teria use a single origin of replication per chromo- 
some (53). Given the organizational similarity be- 
tween bacterial and archaeal chromosomes, it might 
be anticipated that archaea would also have a single 
origin per chromosome. This supposition was con- 
firmed by the pioneering work of the Forterre labo- 
ratory, which provided strong evidence that there was 
a single origin of replication (oriC), in the Pyrococcus 
genome (87). Initial bioinformatics and labeling stud- 
ies predicted a single origin, and this was confirmed 
by a two-dimensional (2D) gel-mapping approach 
that allows the identification of replication intermediates
(77). Subsequently, the precise initiation site was
mapped using high-resolution methodologies (78).
The Pyrococcus origin lies immediately upstream of
the gene encoding the Pyrococcus homolog of the eucaryal
DNA replication factors Orc1 and Cdc6 (see
footnote to Table 1). This organization was immediately
reminiscent of that seen in many bacteria, where
the gene for the DnaA initiator is adjacent to the origin
of replication (83). Thus, Pyrococcus appears to
use a bacterial-like mode of DNA replication with
eucaryal-like machinery (87).

Table 1. Identity of the factors that catalyze the various stages of DNA replication described in Fig. 1

Factor | Archaea^a | Eucarya^a | Bacteria (E. coli)
Initiator | Orc1/Cdc6 | ORC | DnaA
Helicase loader | Orc1/Cdc6 | Cdc6 + Cdt1 | DnaC
Helicase | MCM | MCM complex | DnaB
Single-strand binding | Various^b | RPA | SSB
Primase | Primase (2) | Primosome (4) | DnaG
Replicative DNA polymerase | B-type (C+E)^c; D-type (E) | Pol δ and ε | DNA Pol III
Sliding clamp | PCNA | PCNA | β-Clamp
Clamp loader | RF-C | RF-C | γ-Complex
Ligase | DNA lig 1 | DNA lig 1 | Ligase
Processing | RNaseH/Fen1 | RNaseH/Dna2/Fen1 | RNaseH

^a The eucaryal ORC complex has six subunits, Orc1 to Orc6. Archaea possess proteins that are homologous to both Orc1 and Cdc6 and are termed Orc1/Cdc6; see the main text for discussion.
^b Distinct archaeal species have unique single-strand-binding proteins, hence the designation of “various.”
^c There is a bifurcation in the distribution of polymerases in the Crenarchaeota (C) and Euryarchaeota (E). Crenarchaea possess only B-type replicative DNA polymerases, whereas euryarchaea have both B- and D-family enzymes.

At this time, however, bioinformatics studies, 
based on the poorly understood observation that 
leading and lagging strands often possess distinct se- 
quence properties, were beginning to give hints that 
other archaeal species may possess more than one ori- 
gin of replication (114). In particular, Halobacterium 
was proposed to have two origins of replication, both 
adjacent to genes for Orc1/Cdc6 homologs. However, 
a targeted genetic screen for autonomous replication 
function only found evidence for one of these pre- 
dicted origins (upstream of the orc7 gene for an 
Orc1/Cdc6 homolog). As this screen tested a limited 
number of targets, it is currently unclear if additional 
origins may exist in Halobacterium (7). 
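
One way to look for such strand asymmetries, assumed here for illustration because the text does not name a specific statistic, is the cumulative GC skew: (G − C)/(G + C) summed over consecutive windows along the chromosome. An extremum of the cumulative curve marks a position where the skew changes sign and is therefore a candidate origin or terminus. The function names and the toy sequence below are hypothetical.

```python
def gc_skew(window):
    """(G - C) / (G + C) for one window; 0.0 if the window has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def cumulative_gc_skew(seq, window=1000):
    """Cumulative GC skew over consecutive non-overlapping windows."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        total += gc_skew(seq[start:start + window])
        curve.append(total)
    return curve


def skew_minimum(seq, window=1000):
    """End coordinate of the window at which the cumulative skew is minimal."""
    curve = cumulative_gc_skew(seq, window)
    return (curve.index(min(curve)) + 1) * window


# Synthetic example: a C-rich stretch followed by a G-rich stretch.
toy = "ACCC" * 500 + "AGGG" * 500
print(skew_minimum(toy, window=500))
```

In this synthetic sequence the skew changes sign at position 2000, which is where the cumulative curve reaches its minimum. A genome with several origins would be expected to give a curve with several such turning points, which is harder to interpret.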

The first experimental evidence for multiple ori- 
gins of replication in an archaeon came from studies 
of the crenarchaeon Sulfolobus solfataricus. A 2D 
gel-mapping analysis, employing a targeted approach, 
identified origins of replication (oriC1 and oriC2) in 
noncoding regions upstream of two of the three Sulfolobus
homologs of Orc1/Cdc6 (cdc6-1 and cdc6-3)
(97). No origin activity could be found within 20 kb 
of the final Orc1/Cdc6 homolog, cdc6-2. The targeted 
nature of this screen meant that additional origins 
may exist in the Sulfolobus chromosome, and a 
whole-genome marker-frequency analysis by Bernander
and colleagues provided strong evidence for a
third origin of replication, about 80 kb removed from
cdc6-2 (71). The marker-frequency analysis also in- 
dicated that, following release from a cell cycle 
block, replication initiated from all three origins si- 
multaneously, suggesting a mechanism for coordi- 
nated control at these three sites. Whether this is due 
to colocalization of the origins in a single controlling 
“factory” structure or due to the presence of a trans- 
acting “start” signal is currently unknown. 
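
The logic of a marker-frequency experiment can be sketched with a toy model: shortly after synchronized initiation, markers close to an active origin are present in two copies, while the rest of the chromosome is still present in one. The chromosome length, origin positions, fork rate, and elapsed time below are hypothetical values for illustration, not data from the cited study.

```python
def circular_distance(a, b, length):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % length
    return min(d, length - d)


def copy_number(pos, origins, fork_rate, minutes, length):
    """Return 2 if a bidirectional fork from any origin has passed pos, else 1."""
    reach = fork_rate * minutes
    replicated = any(circular_distance(pos, o, length) <= reach for o in origins)
    return 2 if replicated else 1


# Hypothetical values: 3,000-kb circle, three origins, 6 kb/min forks, 20 min.
LENGTH_KB = 3000
ORIGINS_KB = [0, 600, 1800]
profile = [copy_number(p, ORIGINS_KB, 6, 20, LENGTH_KB)
           for p in range(0, LENGTH_KB, 100)]
print(profile)
```

Each origin appears as a local block of doubled markers centered on its position, including the block that wraps around coordinate 0. Simultaneous firing shows up as blocks of equal width.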

Thus far, all archaeal origins characterized share 
some common features. They have stretches that are 
highly rich in A and T bases, indicative of readily 
meltable DNA (7, 87, 97). Indeed, the precise nu- 
cleotides at which replication initiates have been 
mapped in Pyrococcus and Sulfolobus and are adja- 
cent to, or within, these candidate duplex unwinding 
elements (DUEs). In addition, all origins possess a 
number of repeated sequence elements. Biochemical 
experiments using purified Sulfolobus Orc1/Cdc6 homologs
revealed that these sequence motifs are bound
by differing subsets of the Orc1/Cdc6 homologs (97). 
Indeed, extensive inverted repeat elements, termed 
origin recognition boxes (ORBs), were bound by the 
Cdc6-1 protein (Fig. 2). Candidate ORB elements 
were also identified in the Pyrococcus and Halobac- 
terium origins and were demonstrated to be capable 
of binding Sulfolobus Cdc6-1 in vitro. Thus, it ap- 
pears that ORB consensus elements are a feature of 
several archaeal origins of replication. However, 
oriC2 in Sulfolobus does not have full ORB elements, 
but rather has a shorter motif that corresponds to a 
central conserved core of the ORB element (termed 
mini-ORB or mORB). These mORBs bound Cdc6-1 
with reduced affinity compared with full ORBs. 
Cdc6-3 protein stimulated binding of Cdc6-1 to 
mORBs (97). 
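
A sliding-window scan for AT-rich stretches gives a rough, illustrative way to flag candidate DUEs in a sequence. The window size and threshold below are arbitrary assumptions and the toy sequence is synthetic; actual origin mapping relies on the experimental methods cited above.

```python
def at_fraction(seq):
    """Fraction of A and T bases in seq (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq) if seq else 0.0


def at_rich_windows(seq, window=20, threshold=0.8):
    """Start positions of windows whose A+T fraction is >= threshold."""
    return [i for i in range(len(seq) - window + 1)
            if at_fraction(seq[i:i + window]) >= threshold]


# Synthetic example: GC-rich flank, AT-rich core (positions 40-69), GC-rich flank.
toy = "GC" * 20 + "AT" * 15 + "GC" * 20
hits = at_rich_windows(toy)
print(hits[0], hits[-1])
```

The reported start positions bracket the AT-rich core; a repeat search for ORB-like motifs would be a separate step and is not attempted here.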

Figure 2. Sequence conservation of ORB and mini-ORB (m-ORB)
elements at archaeal origins of replication. These serve as binding
sites for orthologs of the Sulfolobus Cdc6-1 protein. The arrows indicate
an imperfect inverted repeat found in the elements. Sso, S.
solfataricus; Halo, Halobacterium NRC1; Pab, P. abyssi.

It may be significant that in Methanothermobacter
thermautotrophicus, an organism that lacks a
Cdc6-3 homolog, the predicted origin has no full
ORB elements but, instead, has multiple mORB ele- 
ments (18). It is possible that cooperative interactions 
between the M. thermautotrophicus homologs of Sul- 
folobus Cdc6-1 bound to the mORBs may contribute 
to high-affinity binding to this origin. In this light, 
note that the bacterial initiator protein, DnaA, can 
bind cooperatively to noncanonical DnaA-binding 
sites (84). 

The studies on Sulfolobus Orc1/Cdc6 proteins
also revealed that the Cdc6-2 protein bound to mul- 
tiple repeats at the origins (97). Furthermore, these 
repeats overlapped ORB elements or Cdc6-3-binding 
sites. This organization suggested that Cdc6-2 might 
compete for binding to DNA with Cdc6-1 and/or 
Cdc6-3. Analyses of the protein levels of Orc1/Cdc6s 
over the cell cycle revealed that Cdc6-1 and Cdc6-3 
are highest in prereplicative and replicating cells and 
that Cdc6-2 peaks in postreplicative cells. This tight 
temporal partitioning of protein levels, coupled with 
the overlapping nature of the binding sites for Cdc6-1/3
and Cdc6-2, suggests distinct roles for the proteins. 
Possibly the simplest hypothesis is that Cdc6-1 and 
Cdc6-3 act to promote replication and the Cdc6-2 
serves as an inhibitor of re-replication events. 

Thus, it appears that while some archaea possess 
single origins of replication, other species possess mul- 
tiple origins. Why do some archaea, with genomes 
smaller than many bacteria, have multiple origins? 
One possibility may lie in the slow rate of genome 
replication seen in Sulfolobus species; this organism is 
estimated to replicate DNA at about 6 kb min⁻¹ (71).
To ensure sufficiently rapid replication of the genome,
Sulfolobus may therefore have evolved multiple origins.
In contrast, Pyrococcus uses a single oriC, and it
appears that the fork progresses at 20 kb min⁻¹ (87).
It is also possible that species with multiple origins 
exploit differential usage of the origins to modulate 
the growth rate of the cells. In this regard, it will be of 
considerable interest to determine whether there is 
differential usage of origins in Sulfolobus. 
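
The arithmetic behind this argument can be sketched in a few lines of Python. The sketch assumes bidirectional replication (two forks per origin), evenly spaced origins that fire simultaneously, and a constant fork rate. Only the fork rates of 6 and 20 kb per minute come from the text; the genome sizes, the count of three origins, and the function name are illustrative assumptions.

```python
def replication_time_min(genome_kb, fork_rate_kb_per_min, n_origins=1):
    """Minimum time, in minutes, to copy a circular genome.

    Assumes every origin fires once, at the same moment, and launches
    two forks that travel at a constant rate until they meet a neighbor.
    """
    if genome_kb <= 0 or fork_rate_kb_per_min <= 0 or n_origins < 1:
        raise ValueError("sizes and rates must be positive; need >= 1 origin")
    forks = 2 * n_origins
    return genome_kb / (forks * fork_rate_kb_per_min)


# Illustrative genome sizes in kb (assumptions, not values from the text).
SULFOLOBUS_KB = 3000
PYROCOCCUS_KB = 1800

one_origin = replication_time_min(SULFOLOBUS_KB, 6, n_origins=1)
three_origins = replication_time_min(SULFOLOBUS_KB, 6, n_origins=3)
pyrococcus = replication_time_min(PYROCOCCUS_KB, 20, n_origins=1)

print(f"Sulfolobus, 1 origin:  {one_origin:.0f} min")
print(f"Sulfolobus, 3 origins: {three_origins:.0f} min")
print(f"Pyrococcus, 1 origin:  {pyrococcus:.0f} min")
```

Under these assumptions, a single origin would need roughly 250 min at the Sulfolobus fork rate, compared with roughly 83 min for three origins and 45 min for Pyrococcus.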


MELTING THE ORIGIN 


In bacteria, the initiator protein, DnaA, binds the
origin and facilitates an initial melting of the duplex, 
resulting in the replicative helicase, DnaB, being re- 
cruited to a melted bubble of DNA (83, 84). In con- 
trast, there is no evidence in either eucarya or archaea 
that recognition of the origin(s), by origin recognition 
complex (ORC) or Orc1/Cdc6, respectively, melts 
DNA. It is, therefore, currently assumed that minichro- 
mosome maintenance (MCM), the presumptive re- 
plicative helicase, is initially recruited onto double- 
stranded DNA. In both bacteria and eucarya, the 
initiator proteins require additional cofactors to facil- 
itate recruitment and loading of the replicative heli- 
case. Bacteria utilize DnaC to facilitate this reaction 
(2, 12, 26, 33, 38), and eucarya use Cdt1 and Cdc6 
(4). There are no clear primary sequence homologs of 
Cdt1 in archaeal genomes. Furthermore, as discussed 
in the footnote to Table 1, archaeal Orc1/Cdc6s share
similarity to both Orc1 and Cdc6, and it has been 
proposed that these proteins may play joint roles in 
marking origins and loading MCM. However, to date 
no system for MCM loading in the Archaea has been 
published and, thus, protein requirements for the re- 
action remain undefined. 

Nevertheless, strong indications have emerged 
from the work of Kelman and colleagues that there is 
indeed a direct physical and functional interaction 
between archaeal Orc1/Cdc6 homologs and the MCM
complex (103). Several laboratories (19, 22, 42, 62, 
80, 101) have demonstrated that the homomulti- 
meric archaeal MCM (a double hexamer in M. therm- 
autotrophicus and single hexamer in Sulfolobus and 
Archaeoglobus) possesses processive helicase activity 
and moves in a 3’ to 5’ direction. It had previously 
been observed that the helicase activity of bacterial 
DnaB was inhibited by the helicase loader, DnaC. In 
a conceptually analogous series of experiments, Kelman
and colleagues demonstrated that the M. thermautotrophicus
Orc1/Cdc6 proteins could inhibit the
helicase activity of MCM (103). Furthermore, they 
revealed that a direct physical interaction could be 
detected between Orc1/Cdc6s and MCM. Thus, 
while additional, as yet unidentified factors may play 
roles in the MCM-loading process, it is likely that the 
Orc1/Cdc6 proteins will play an active role in the
reaction.

While the basis of MCM recruitment to origins 
of replication remains poorly defined at the molecular 
level, there have been a number of studies of the 
structure and function of the MCM complex. Most of 
the studies have been on the M. thermautotrophicus 
MCM, although some recent biochemical studies of 
Sulfolobus MCM have provided insight into the
mechanism by which the helicase moves
along DNA.


MCM 


Archaeal MCMs are large homomultimeric machines
that harness the energy released by ATP hydrolysis
to bring about melting of double-stranded
DNA. The archaeal MCM monomer is approximately
70 kDa and has the domain organization
cartooned in Fig. 3, with an approximately 30-kDa
N-terminal region, followed by an AAA+ ATPase domain
and a poorly conserved helix-turn-helix at the
C-terminal end of the protein. Electron microscopic
studies have revealed that M. thermautotrophicus
MCM forms ring-shaped structures (22, 89, 113).
Diverse studies have described a range of different
stoichiometries for the M. thermautotrophicus MCM,
with single-hexamer, double-hexamer, and heptameric
forms of the protein described. In addition, filamentous
forms have been seen (21). This degree of structural
heterogeneity appears to be common to a number
of phylogenetically diverse ring-shaped helicases.
For example, the gp4 helicase of bacteriophage T7
has been observed in both hexameric and heptameric
forms (105, 107).


Figure 3. Domain organization of an MCM monomer. The crystal structure of the N-terminal region of M. thermautotrophicus
MCM has been solved. This region forms a double hexamer; for simplicity, only one hexamer is shown. The DNA-binding
β-hairpin structures are indicated. HTH, helix-turn-helix domain. Modified from Nature Structural Biology (34) with permission
of the publishers.

In MCM, the central hole in the ring is of sufficient
width to accommodate a double helix of DNA.
The crystal structure of the first 286 amino acids of M. 
thermautotrophicus MCM has been solved and shows 
that this N-terminal section of the molecule has a six- 
fold symmetric structure that forms a double hexamer, 
with individual component hexamers aligning in a 
head-to-head manner (106). In the structure, a zinc- 
binding motif appears to be involved in mediating 
hexamer-hexamer interactions. Mutational analysis of 
this motif demonstrated little effect on the multimeric 
status of the protein, but the mutations did affect 
MCM binding of single-stranded DNA (91). 

The structure also revealed that each subunit in 
the component hexamers contributes a β-hairpin motif
that points in toward the center of the ring (Fig. 3). 
Mutation of highly conserved basic residues at the tips 
of the hairpins reduces the ability of MCM to interact 
with DNA (34). Interestingly, the mutations have dis- 
tinct effects depending on the presence of the C-termi- 
nal domain of the protein. When these mutations were 
in a truncated (amino acids 1 to 286) M. therm- 
autotrophicus MCM, DNA binding was abrogated. In 
contrast, similar mutations in full-length Sulfolobus 
MCM resulted in a quantitative reduction, but not 
loss, of DNA-binding affinity and helicase activity. 
These data suggest additional DNA-binding sites are 
present in the C-terminal domain(s) of the protein (80). 

Inspection of the sequence of the AAA+ domain
of MCM revealed a striking similarity with helicases
of the phylogenetically distinct Superfamily 3
(exemplified by the large T antigen, Tag, of eucaryal simian
virus 40 [SV40]). Tag has been shown to possess a
β-hairpin insertion in the AAA+ domain (39, 68).
Furthermore, this hairpin has been observed to move
during the ATP-binding and hydrolysis cycle of Tag,
leading to the proposal that it brings about the power
stroke of the helicase along DNA (39). A similar sequence
insertion has been noted in MCM, and mutational
analyses have revealed that mutation of absolutely
conserved basic residues in the proposed tip
of this hairpin has only a modest effect on the
DNA-binding activity of the enzyme but completely
abolishes the helicase activity of MCM (80). It appears,
therefore, that MCM has two sets of DNA-binding
β-hairpins, one in the N-terminal domain and one in
the AAA+ domain. Mutation of either alone only
modestly reduces DNA binding, but when both hairpins
are mutated, the helicase can no longer bind
DNA. By analogy with SV40 Tag, it has been proposed
that the C-terminal hairpin drives MCM along
DNA, with the N-terminal hairpin playing a minor
role (if any) in this process.

It is currently unclear what precise mechanism
enables MCM to mediate strand separation
during the helicase reaction. Kelman and colleagues
have elegantly demonstrated that M. thermautotrophicus
MCM can translocate along either single-
or double-stranded DNA (104). They have also
shown that the presence of a flap, in the form of a
protruding DNA 5' end, allows MCM to peel this
strand off the DNA (Fig. 4A). This could suggest that
MCM acts as a molecular bulldozer, stripping the
strands apart ahead of the main body of the enzyme.
In a related model, Onesti and colleagues, in their
electron microscopic studies (89), proposed that pores in
the walls of the ring-shaped helicase may serve as exit
pores for the displaced strand (Fig. 4B). Finally, it is
possible that, in vivo, MCM acts as a double
pump, with individual hexamers bound to
double-stranded DNA and pumping against each other,
thereby forcing DNA into the center of the helicase
and leading to unwinding (Fig. 4C). Regardless of the
mechanism by which MCM unwinds DNA, the result
is the generation of single-stranded DNA that can act
as a template for initiation and elongation of DNA
synthesis.


Figure 4. Models for the mechanism of DNA unwinding by the
MCM helicase. (A) Single hexamer of MCM translocating along
single-stranded DNA in a 3' to 5' direction with the C-terminal
AAA+ domain leading. As it translocates, it unzips DNA ahead of
it. A similar situation is shown in B, the difference being that the
displaced strand is passed out through an exit channel in the MCM
hexamer. An implication of this model is that the motor domain
of MCM would bind to double-stranded DNA and the N-terminal
domains would bind to single-stranded DNA. (C) A cutaway model
of a double hexamer of MCM, with only two of the subunits of
each hexamer shown. In this model, the two hexamers are held
together by the N-terminal domains, and rather than hexamers
moving on DNA, DNA is pumped into the central cavity of the double
hexamer. Single-stranded loops of DNA are generated, and these
are extruded from the body of the enzyme.


SINGLE-STRANDED DNA-BINDING PROTEINS


Single-stranded (ss) DNA-binding proteins (SSBs)
do not possess any enzymatic activity but have a structural
role. Their main function is to stabilize ssDNA by
preventing the formation of secondary structures and
to protect it from chemical modifications by coating
the unwound DNA. The cellular functions of SSBs
have been best studied in Eucarya, where they are involved
in various stages of DNA replication, recombination,
and repair, during which ssDNA occurs (112).

SSBs bind DNA via a particular structure called 
an oligonucleotide/oligosaccharide-binding fold or 
OB fold (86). In bacteria, the functional SSB is a homo- 
tetramer that wraps 65 nucleotides. Each monomer 
possesses one OB fold and an acidic C-terminal tail 
involved in protein-protein interactions (92, 93). In 
eucarya, the SSB is called Replication Protein A, RPA. 
It consists of a heterotrimer formed by RPA70, RPA32, 
and RPA14, each subunit being named after its molecular
mass (13). RPA70 possesses four OB folds and
a C-terminal zinc finger motif whose function is still 
unclear. One OB fold is found in each RPA32 and 
RPA14, but not all of these are necessary for DNA 
binding (8, 9). In archaea, the sequence arrangements 
of SSBs and their multimerization level vary between 
bacterial SSB-like and eucaryal RPA-like arrange- 
ments (Table 2). 

Euryarchaeal SSBs range from a monomeric SSB 
in Methanocaldococcus jannaschii, which has four 
OB folds and a zinc finger motif near its C terminus, 
reminiscent of that in RPA70 (58), to the heterotrimeric
RPA from Pyrococcus furiosus, composed of
RPA41, RPA32, and RPA14 (64). The largest sub- 
unit, RPA41, consists of an OB fold and a C-termi- 
nal zinc finger motif. RPA41 was coimmunoprecipi- 
tated with RadA, a RecA/Rad51 family protein, and 
a Holliday junction resolvase, suggesting a role in 
strand exchange during homologous recombination. 
RPA41 also coimmunoprecipitated with DNA repli- 
cation enzymes (64). More recently, three SSBs from 
the mesophilic archaeon Methanosarcina acetivorans 
were characterized: RPA1, RPA2, and RPA3 (95). 
Surprisingly, these three polypeptides did not form a
heterotrimer but homomultimerized. Moreover, RPA1
possesses four OB folds but no zinc finger motif, 
whereas RPA2 and RPA3 have two OB folds and a 
zinc finger motif. Taken together, these observations 
suggested that MacRPAs function independently of 
each other (95). 

By far the most studied archaeal SSB is that of 
the crenarchaeon S. solfataricus. It is a 16-kDa mono- 
mer with a sequence arrangement similar to that of 
bacteria, as it consists of a single OB fold and an 
acidic C-terminal tail (43, 110). However, the crystal 
structure of S. solfataricus SSB showed that its struc- 
tural architecture is more similar to the DNA-binding 
domain of RPA than to the bacterial SSB (63). The 
flexible C-terminal tail of S. solfataricus SSB is not in- 
volved in DNA binding but is believed to play a role 
in protein-protein interactions (110). Recently, this 
acidic tail was shown to interact with RNA poly- 
merase (Richard et al., 2004) and to overcome tran- 
scriptionally repressive effects of the chromatin protein 
Alba (3). Furthermore, S. solfataricus SSB increased 
the RecA-mediated DNA strand exchange in Es- 
cherichia coli (43) and activated the in vitro activity 
of S. shibatae reverse gyrase, even when DNA was 
coated with the chromatin protein Sul7d (88). 


DNA PRIMASES 


Because replicative DNA polymerases cannot 
synthesize DNA de novo, they require the presence of
specialized enzymes to complete DNA replication. 
These enzymes, DNA primases, are by definition 
DNA-dependent RNA polymerases. They initiate 
DNA replication once only on the leading strand but 
several times on the lagging strand by synthesizing 
RNA primers de novo, which are then elongated by 
DNA polymerases to form the Okazaki fragments. 


Table 2. Composition of SSBs in Archaea, Eucarya, and Bacteria, highlighting differences in subunit composition and architecture


Organism                  Form in solution          Subunit  No. of OB folds  Motif at     Reference(s)
                                                             per monomer      C terminus
E. coli                   Homotetramer                       1                Acidic tail  93
Saccharomyces cerevisiae  Heterotrimer              RPA70    4                Zinc finger  13
                                                    RPA32    1
                                                    RPA14    1
M. jannaschii             Monomer                            4                Zinc finger  58
P. furiosus               Heterotrimer              RPA41    1                Zinc finger  64
                                                    RPA32    1
                                                    RPA14    1
M. acetivorans            Homotetramer/homodimer    RPA1     4                             95
                          Homodimer                 RPA2     2                Zinc finger
                          Homodimer                 RPA3     2                Zinc finger
S. solfataricus           Monomer or homotetramer            1                Acidic tail  43, 110

Bacterial primases, as exemplified by E. coli DnaG,
are often physically associated with the replicative
helicase. Indeed, in some bacteriophages, such as T7,
the primase and helicase are found in a single
polypeptide (36).

Archaeal primases are heterodimers consisting of 
a catalytic and a regulatory subunit. Initial studies of 
the biochemical properties of the P. furiosus small cat- 
alytic subunit on its own gave the very surprising 
finding that the enzyme was capable of synthesizing 
2.4 kb long DNA, not RNA, products in vitro (10). 
However, subsequent work revealed that, when the 
regulatory subunit was present, the dimeric primase 
was then capable of synthesizing both DNA and 
RNA (70). Additionally, the DNA products were 
shorter (0.7 kb). Subsequent in vitro studies of P. 
horikoshii and S. solfataricus enzymes also showed 
that the primase can synthesize long RNA and DNA 
products de novo, suggesting that this property
exists in all archaeal primases (65, 75). In eucarya, the
core primase heterodimer corresponds to the archaeal
heterodimeric primase. However, in eucarya the core
primase further associates with DNA polymerase α
and the B subunit to form the pol α-primase complex
(reviewed in reference 67). Since archaeal primases
are not stably associated with pol α, it is tempting to
suggest that their dual RNA and DNA synthesis
capability allows archaeal primases, first, to initiate the
formation of RNA primers and, second, to extend
these primers to form RNA-DNA hybrids.

Until recently, only the crystal structure of the
Pyrococcus primase catalytic subunit on its own was
available (1, 50). The crystallographic analysis showed
that the catalytic subunit consists of two domains: a
larger α/β-domain, which includes the catalytic site
(the prim domain), and a smaller α-helical domain of
unknown function. Surprisingly, a zinc-binding motif
was also found not far from the catalytic site.
Recently, the crystal structure of the heterodimeric
primase of S. solfataricus offered the first insight into the
architectural organization of the primase large
regulatory subunit PriL (66). This subunit consists of an
entirely α-helical domain forming the bulk of PriL
and a smaller α/β-domain responsible for the
interaction with the catalytic subunit PriS (Fig. 5). In PriS,
the α-helical domain is much smaller than that of the
Pyrococcus enzyme and the zinc-binding motif
comparatively longer. However, the organization of the
prim domain is conserved in the Pyrococcus and the
S. solfataricus primases.

When the Pyrococcus primase catalytic subunit
structure was solved, it was found that the arrangement
of the triad of catalytic aspartate residues was
very similar to that of the DNA pol X family member,
eucaryal pol β. Note, however, that the topology of
the fold that contains the aspartate residues is
unrelated in primase and pol β, which is suggestive of
convergent evolution of the protein structure (1, 67).


Figure 5. Crystal structure of the heterodimeric core primase of
S. solfataricus. The two subunits are indicated, as are the catalytic
aspartate residues. As the regulatory subunit is spatially removed
from the catalytic site, it is proposed that this subunit exerts its
effect by modulating primer length (see reference 66).


DNA pol β is a member of the DNA pol X family
that includes pol λ, pol μ, pol σ, and terminal
deoxynucleotidyltransferase (TdT). These members of the
pol X family play roles in DNA replication, repair,
and recombination processes (49, 94). Surprisingly,
the primase from S. solfataricus possesses a TdT
activity (28, 65). Taken together with the fact that pol
λ and pol μ can initiate RNA and DNA synthesis de
novo (94), it is likely that the particular arrangement
of catalytic residues in primase and pol X has been
selected for its enzymatic flexibility. Because most
archaea do not possess a pol X family DNA repair
polymerase, these data also suggest that it is possible
that primase plays a role in DNA damage repair
processes in archaea.

Finally, it appears that the archaeo-eucaryal pri- 
mase catalytic fold may be derived from an evolu- 
tionarily ancient source (51, 67). Recent work has 
found distant relatives of this fold in diverse mole- 
cules, including the ORF904 product of the pRN1 
plasmid from S. islandicus (69), the primase-helicase 
of bacteriophage phBC6A51 (79), and the primase-like
molecule, LigD, involved in nonhomologous end joining (NHEJ) (29).


REPLICATIVE DNA POLYMERASES 


DNA polymerases act to extend the primers generated
by primase complexes (49). These enzymes are
divided into six families based on phylogenetic
relationships: families A, B, C, D, X, and Y. Not all of
these are replicative polymerases, however; some are
utilized solely for DNA repair. The E. coli replication
fork is built around the DNA pol III holoenzyme, a
large assembly containing two copies of the family C
DNA polymerase subunit, subunit α, which is
encoded by the dnaE gene. One of these polymerase
subunits is thought to replicate the lagging strand and
one the leading strand (81). This was thought to be
the case for all bacteria until two polymerases were
identified in Bacillus subtilis, one encoded by
a gene homologous to E. coli dnaE, dnaE_BS, and the
other by polC. Pol C and DnaE_BS are thought to be
responsible for leading-strand and lagging-strand
DNA synthesis, respectively (30).

Eucaryal DNA synthesis requires three B-family
polymerases: Pol α, Pol δ, and Pol ε. Pol δ and Pol ε
are the major replicative polymerases, with Pol δ
acting on the lagging strand and both acting on the
leading strand of DNA replication. Pol α is involved in
lagging-strand Okazaki fragment synthesis by
extending primers, which are subsequently transferred
to Pol δ (37).

Similar to eucarya, the replicative polymerases 
utilized in the Archaea were initially thought to be 
solely members of the family B polymerases based on 
predictions from complete genome sequences; for ex- 
ample, S. solfataricus was predicted to contain three 
family B members: Pol B1, Pol B2, and Pol B3. It was 
later established that although multiple B-family 
polymerases are a feature of several crenarchaea, such 
as Aeropyrum pernix, which has had two activities 
identified from cell extracts (14), euryarchaeal DNA 
replication requires a B-family polymerase and an- 
other, unique heterodimeric polymerase, Pol D. Pol D 
is composed of two subunits, DP1 and DP2, DP2 being
the polymerase domain and DP1 the 3'-5'-exonuclease
domain (15). The archaeal B-family polymerases
were found to uniquely possess a “read
ahead” recognition pocket for uracil bases (23), which are
produced as a result of cytosine deamination. This results
in replication halting four bases from the primer/
template junction and prevents the mutation of C-G
base pairs to A-T pairs (35). This feature may be
particularly important for archaea that inhabit high- 
temperature environments, as deamination is more 
prevalent under these conditions, and, if undetected 
and uncorrected, genetic integrity would be rapidly 
lost by these organisms. 
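
The read-ahead behavior described above can be illustrated with a toy model in Python. This is a minimal sketch under stated assumptions, not a description of the enzyme's mechanism: the template is a plain string written in the order it is read, the function name is hypothetical, and the convention that the pocket inspects the base four positions beyond the last copied base is an assumption made here to mirror the "four bases from the primer/template junction" observation.

```python
def extend_with_read_ahead(template, read_ahead=4):
    """Return (bases_copied, stalled) for a toy read-ahead polymerase.

    The polymerase copies one template base per step. Before each step
    it inspects the template base `read_ahead` positions beyond the last
    base it has copied and stalls if that base is uracil.
    """
    copied = 0
    while copied < len(template):
        probe = copied - 1 + read_ahead  # index seen by the pocket
        if probe < len(template) and template[probe] == "U":
            return copied, True          # stall upstream of the uracil
        copied += 1
    return copied, False


# A deaminated cytosine (now U) at index 7 of the template.
print(extend_with_read_ahead("ACGTACGUACGT"))
# An undamaged template is copied to the end.
print(extend_with_read_ahead("ACGTACGT"))
```

In this simplified model a uracil lying closer to the starting junction than the read-ahead distance is never inspected; whether the enzyme behaves similarly is not addressed here.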

Crystal structures have been determined for sev- 
eral of the archaeal B-family polymerases. These have 
revealed the classical “right hand” organization of 
the C-terminal domain, with catalytic residues pre- 
sent in the palm domain. The N-terminal domain has 
two subdomains, the first being the editing 3'-5'-exonuclease
domain. The other N-terminal subdomain
forms a fold related to an RNA-binding motif
and is the site of the uracil recognition pocket. 

The relationship between Pol B and Pol D in the 
euryarchaea has been investigated (44). It has been es- 
tablished that both are replicative polymerases, but 
they possess different biochemical properties and 
have been hypothesized to perform specific roles at 
the replication fork. Pol D is capable of primer ex- 
tension of both RNA and DNA primers, although not 
to full length. However, the addition of proliferating 
cell nuclear antigen (PCNA) (see below) stimulates 
this extension. Pol B, on the other hand, is only ca- 
pable of extending DNA primers; these are extended 
to the full length of the template in the absence of 
PCNA. The inability of Pol B to extend RNA primers 
cannot be overridden by the addition of PCNA. Pol D 
readily achieves strand displacement of DNA primers. 
However, displacement of RNA primers requires the 
additional presence of PCNA. In contrast, Pol B is in- 
capable of achieving strand displacement unless PCNA 
is present; even then, only DNA primers can be dis- 
placed. These data have given rise to a model whereby 
Pol B is the leading-strand polymerase, after initiation 
by Pol D, and Pol D is also the lagging-strand poly- 
merase (44). 
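
The in vitro observations summarized above can be restated compactly as two lookup functions. The following Python sketch only encodes the qualitative outcomes reported in this paragraph; the function names and the string labels "PolB", "PolD", "RNA", and "DNA" are hypothetical conveniences, and the degree of stimulation by PCNA is not modeled.

```python
def can_extend(polymerase, primer, pcna=False):
    """Whether the polymerase can extend a primer of the given type."""
    if polymerase == "PolD":
        # Pol D extends RNA and DNA primers; PCNA only stimulates this.
        return primer in ("RNA", "DNA")
    if polymerase == "PolB":
        # Pol B extends DNA primers only, with or without PCNA.
        return primer == "DNA"
    raise ValueError(f"unknown polymerase: {polymerase}")


def can_displace(polymerase, primer, pcna=False):
    """Whether the polymerase can displace a downstream primer."""
    if polymerase == "PolD":
        # DNA primers are displaced readily; RNA primers need PCNA.
        return primer == "DNA" or (primer == "RNA" and pcna)
    if polymerase == "PolB":
        # Pol B needs PCNA, and even then displaces only DNA primers.
        return pcna and primer == "DNA"
    raise ValueError(f"unknown polymerase: {polymerase}")


for pol in ("PolB", "PolD"):
    for primer in ("RNA", "DNA"):
        for pcna in (False, True):
            print(pol, primer, "PCNA" if pcna else "no PCNA",
                  "extend:", can_extend(pol, primer, pcna),
                  "displace:", can_displace(pol, primer, pcna))
```

Laid out this way, the asymmetry that motivates the model is easy to see: only Pol D can start from an RNA primer, whereas Pol B depends on PCNA for any strand displacement.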

As the crenarchaea contain more than one B-family 
homolog, it is tempting to speculate that, as is seen in 
the Eucarya, the Euryarchaeota kingdom, and some 
members of the Bacteria, these homologs may also 
have specific and distinct roles at the replication fork. 


SLIDING CLAMPS 


Sliding clamps are proteins with no known en- 
zymatic activity, whose principal function is to in- 
crease the processivity or activity of proteins with 
which they interact by tethering them to duplex 
DNA. Although the amino acid sequences of sliding 
clamps are very different in the three domains of life, 
their three-dimensional structure is very similar: they 
all form a pseudohexameric ring-shaped structure 
that can accommodate a double-stranded DNA in its 
center (54) (Fig. 5). The E. coli sliding clamp, the β-clamp,
consists of a homodimer with each polypeptide ar- 
ranged in three domains to form a quasi-hexameric 
structure. In eucarya, the processivity clamp is called 
PCNA and is a homotrimer. Each monomer is formed 
of two domains that allow the formation of the 
pseudohexameric ring-shaped structure. Sliding 
clamps are well known for their role in DNA repli- 
cation, but they also interact with factors involved in 
other cellular processes, such as DNA repair and re- 
combination, and cell cycle regulators (reviewed in 
references 60, 109, 111). The interaction of PCNA
with other proteins occurs principally via a conserved 
motif, the PCNA-interacting protein box or PIP mo- 
tif, which is usually present at the N- or the C-terminal 
end of the proteins (111). Recently, the structural ba- 
sis for the interaction of the flap endonuclease FEN1 
and PCNA was uncovered in Archaeoglobus fulgidus 
(20) and in humans (98). 

PCNA is well conserved between Eucarya and 
Archaea, and there appears to be at least one PCNA 
homolog in each archaeal species. In the euryarchaea 
P. furiosus, M. thermautotrophicus, and Thermococcus
fumicolans, only one homolog of PCNA was
found; it forms a homotrimer that stimulates the
activity of various DNA polymerases (16, 46, 61). The
elucidation of the crystal structure of P. furiosus PCNA 
suggested that the thermostability of this protein 
compared with that of eucaryal sliding clamps may be 
due to the increased number of ion pairs, and thus 
electrostatic interactions, within the protein (76). The 
crenarchaea appear to be slightly more complex, as 
they possess up to three PCNA homologs. In A. pernix, 
the PCNA subunits appear capable of forming both 
homotrimeric and heteromultimeric complexes (25). 
Although each PCNA is capable of increasing the 
activity of DNA polymerases, PCNA3 appears to be 
the most efficient. In another member of the Crenar- 
chaeota, S. solfataricus, PCNA exists as a hetero- 
trimer of three distinct subunits. No homomultimer- 
ization could be detected for individual subunits. The 
formation of the heterotrimer occurs in an obligate 
order: PCNA1 and PCNA2 interact first before 
PCNA3 can complete the ring (31). 

PCNAs are ring-shaped molecules and, to load 
onto DNA, require a ring-opening and -shutting re- 
action. In eucarya, the clamp loader, RF-C, mediates 
this reaction. Although archaea encode RF-C, many 
archaeal PCNAs are capable of self-loading onto 
DNA without the help of a clamp loader; the presence 
of RF-C does, nevertheless, stimulate the process (16, 
25, 61). 


CLAMP LOADER 


Generally, in all three domains of life, the clamp 
loader is a heteropentamer consisting of one large 
subunit and four smaller ones. Each of these 
monomers belongs to the AAA+ ATPase family (27,
52), and although ATP hydrolysis is not required to 
load PCNA onto DNA, the presence of one or more 
ATPs is necessary. 

In E. coli, the clamp loader is the γ-complex and
consists of three different subunits, with a stoichiometry
of γ3δδ', that form the pentamer (54-57). The
eucaryal clamp loader is composed of five different
subunits called RF-C1 to RF-C5 (24). Within these 
subunits, seven segments of amino acids (boxes II to 
VIII) are highly conserved. These include the Walker 
A and B motifs responsible for NTP-binding and 
ATPase activity, which are located in boxes III and V, 
respectively. The large subunit RF-C1 possesses an 
additional conserved domain in its N-terminal region 
(24). In comparison with the two other domains of 
life, the archaeal clamp loader is simpler because only 
two genes homologous to RF-C have been identified 
in fully sequenced genomes. This means that all archaeal
RF-C complexes are composed of only two types of subunit:
RF-C_L (large subunit) and RF-C_S (small subunit).

To date, a eucaryal-like pentameric organization
of the clamp loader (four RF-C_S, one RF-C_L) has
been identified in only two archaea, S. solfataricus
and A. fulgidus (90, 99). In both cases, the RF-C
complex increased the activity of DNA pol B and
DNA pol D, in the presence of PCNA. In M. thermautotrophicus,
the RF-C appears to be a hexamer
capable of loading PCNA onto singly nicked circular
DNA and of activating DNA synthesis by pol B
in the presence of PCNA (61). In P. furiosus and in 
P. abyssi, the composition of the RF-C complex is still 
unclear. It was proposed to be three to four RF-C_S
and one to two RF-C_L subunits for P. furiosus (17) and
either a hexamer (two RF-C_S and four RF-C_L subunits) or
a trimer (two RF-C_S and one RF-C_L subunits) for P. abyssi
(46). Both PfuRF-C and PabRF-C were capable of
stimulating the activity of DNA pol D in the presence
of PCNA (17, 45). 

For a long time, the loading of PCNA onto DNA 
by RF-C was believed to require ATP hydrolysis. 
However, studies on PabRF-C and AfuRF-C demon- 
strated that ATP binding to the clamp loader, but not 
hydrolysis, is necessary to load the processivity clamp 
onto DNA (45, 99, 100). More recently the impor- 
tance of ATP binding and hydrolysis by AfuRF-C for 
the loading of PCNA was studied in detail (100). The 
authors suggested that the initial binding of two mol- 
ecules of ATP by RF-C is necessary for its interaction 
with PCNA. This interaction then leads to the bind- 
ing of two additional ATP molecules and allows the 
loading of PCNA onto DNA. The subsequent hydro- 
lysis of three ATP molecules results in the release of 
RF-C from the PCNA-DNA complex, which allows 
the sliding clamp to bind other proteins (e.g., DNA 
polymerase) and to fulfill its processivity function. 
According to this model, the pentameric AfuRF-C
can bind only four ATP molecules, like the eucaryal
RF-C (40, 41).
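
The ordered steps of this proposed loading cycle can be written down as a small state machine. The Python sketch below is only a bookkeeping aid for the model described above: the state and event names are hypothetical labels, and nothing is assumed about the fate of the fourth ATP, which the model as summarized here does not specify.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("free_rfc", "bind_2_atp"): "atp_primed",
    ("atp_primed", "bind_pcna"): "pcna_bound",
    ("pcna_bound", "bind_2_more_atp"): "loading_competent",
    ("loading_competent", "load_onto_dna"): "clamp_on_dna",
    ("clamp_on_dna", "hydrolyze_3_atp"): "rfc_released",
}
ATP_BOUND = {"bind_2_atp": 2, "bind_2_more_atp": 2}
ATP_HYDROLYZED = {"hydrolyze_3_atp": 3}

FULL_CYCLE = [
    "bind_2_atp",
    "bind_pcna",
    "bind_2_more_atp",
    "load_onto_dna",
    "hydrolyze_3_atp",
]


def run_cycle(events):
    """Replay events in order; return (state, atp_bound, atp_hydrolyzed).

    Raises ValueError if an event occurs out of the proposed order.
    """
    state, bound, hydrolyzed = "free_rfc", 0, 0
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
        bound += ATP_BOUND.get(event, 0)
        hydrolyzed += ATP_HYDROLYZED.get(event, 0)
    return state, bound, hydrolyzed


print(run_cycle(FULL_CYCLE))
```

Replaying the full cycle ends with RF-C released after four ATP molecules have been bound and three hydrolyzed, and the clamp is loaded before any hydrolysis event occurs, which is the central point of the model.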

Recent structural snapshots have been obtained
of the yeast (11) and archaeal (85) RF-C-PCNA complexes
by crystallography and electron microscopy,
respectively (108). In the archaeal structure, the
horseshoe-shaped RF-C stacks on top of the ring- 
shaped PCNA. The large subunit of RF-C contacts 
one PCNA subunit, and two or three of the small sub- 
units contact the remaining two subunits of PCNA 
(85). The high-resolution yeast structure shows a similar
arrangement. However, in this case the RF-C
small subunits, rather than lying
flat on PCNA, spiral upward from the ring (11). It is
possible, therefore, that RF-C mediates opening of 
the PCNA ring by a sideways pull on one or more 
subunits, resulting in opening of the ring into a struc- 
ture resembling a lock washer (Fig. 6). 


ARCHAEAL CELL CYCLE 


While an ever-growing body of data has yielded 
considerable insight into the form and function of the 
archaeal DNA replication machinery, much less is 
known about the details of the archaeal cell cycle and 
its control. Indeed, what little is known suggests
that diverse mechanisms may be employed
to regulate chromosome copy number, to
coordinate DNA replication and cell division, and even
to mediate the process of cell division itself. 

This latter point is emphasized by the finding 
that, while euryarchaeal genomes encode clear
homologs of the bacterial FtsZ cell division protein, the
crenarchaeal genomes do not. FtsZ is homologous to 
the key eucaryal cytoskeleton component, tubulin. 
FtsZ plays an essential role in the septation process 
in bacteria and presumably those archaea in which it 
is found. Bacterial FtsZ assembles into the so-called Z ring
at the junction of the two soon-to-be
daughter cells. The ring then contracts and thereby
drives septation. FtsZ rings have also been observed 
in the archaeon Haloferax mediterranei, providing 
evidence for an analogous role for this protein in ar- 
chaea. It is intriguing that no clear FtsZ or tubulin ho- 
mologs are detected in the available crenarchaeal 
genomes and the identity of the proteins involved in 
cell division in these organisms remains unknown. 

Flow cytometric analyses have been performed 
on several archaeal species within the Euryarchaeota 
and Crenarchaeota. The crenarchaeal Sulfolobus 
species are among the best studied and have been 
shown to have a cell cycle that varies between one 
copy of the chromosome in newborn cells and two 
copies in a postreplicative period (perhaps analogous 
to eucaryal G2 phase of the cell cycle). The single 
chromosome (G1-like) phase of the cell cycle is very 
short and the G2 period is the dominant feature of the 
Sulfolobus cell cycle. Furthermore, cells in stationary 
phase have two fully replicated copies of the chromosome 
(6, 48). This makes intuitive sense; an organism 
living aerobically in a high-temperature environment 
will experience high levels of DNA damage. By biasing 
a one genome-two genome cell cycle to the phase with 
two copies of the chromosome, the organism enhances 
its chances of repairing DNA damage by the error-free 
method of homologous recombination. 


Figure 6. Model for loading of the PCNA clamp by RF-C. Binding of ATP by RF-C permits formation of the RF-C-PCNA complex. 
This leads to PCNA opening, presumably by repositioning of the RF-C subunits. DNA is loaded and the clamp resealed. 
None of these steps require ATP hydrolysis. ATP hydrolysis is, however, required for the next stage, recruitment of DNA 
polymerase (DNA pol) and release of RF-C. 

The euryarchaeon A. fulgidus, a high-temperature 
anaerobe, has a cell cycle distribution similar to Sul- 
folobus, with a short prereplicative phase and pro- 
longed postreplicative phase. However, in contrast to 
the tight control between one and two chromosome 
copies in Sulfolobus, Archaeoglobus cells can possess 
one, two, three, or four copies of the chromosome. 
Furthermore, conditions were observed in which the 
majority of Archaeoglobus cells in stationary phase 
had a single chromosome (72). 

Studies of Methanococcus jannaschii revealed 
that it had a complex number and pattern of copies of 
its chromosome, with possibly as many as 15 copies 
present during exponential growth, and between one 
and five copies in stationary-phase cells (74). 

Recent work has established that M. thermau- 
totrophicus possesses two, four, or eight copies of the 
genome (73). Although some four-chromosome cells 
were present in stationary phase, the majority of cells 
had two copies of the chromosome. However, micro- 
scopy revealed that the two copies of the chromo- 
some present in stationary phase were present in dis- 
crete nucleoids. This contrasts with stationary-phase 
Sulfolobus cells, in which the two chromosomes ap- 
pear to be in a single nucleoid structure. In M. therm- 
autotrophicus the number of nucleoids observed cor- 
related well with the number of genomes present, 
leading to the proposal that, in this organism, chro- 
mosome segregation occurred rapidly after, or even 
concomitant with, DNA replication. This suggests 
that in contrast to eucarya, where, during G2, sister 
chromatids are paired postreplicatively, M. therm- 
autotrophicus may segregate its chromosomes in a 
manner akin to that seen in bacteria (73, 102). 
Whether this bacterial-like paradigm for chromosome 
segregation can be extended to other archaea is not 
yet known. 

While some important basic cell cycle parame- 
ters have now been established for a range of archaeal 
species, there is still very little information available 
about the mechanisms that drive cell cycle progres- 
sion and about the variation of protein levels during 
the cycle. This is due in part to the relative paucity of 
genetic tools for archaeal species and also due to the 
technical challenge of achieving good methods of cell 
cycle synchronization for archaeal cells. It has been 
possible to achieve partial synchronization of Sulfolobus 
acidocaldarius cultures by using treatment with acetic 
acid to arrest the cycle (by an as yet unknown 
mechanism) followed by release into fresh 
medium (71). A degree of synchrony has also been 
achieved for Halobacterium by using release from 
a block imposed by the DNA synthesis inhibitor, 
aphidicolin (47). 

The latter study, performed by Herrmann and 
Soppa, examined the expression profile of two ho- 
mologs of structural maintenance of chromosome 
(SMC) proteins (sph1 and hp24), which could play a 
role in DNA segregation, DNA repair, and/or cell di- 
vision (47). This study found evidence for cell cycle 
regulation of the abundance of both genes’ transcripts. 
In agreement with the transcript profile, protein levels 
of Sph1 also showed modulation. This study also ex- 
amined nucleoid distribution during the cell cycle, 
and the results suggested that chromosome segrega- 
tion was concomitant with DNA replication, as was 
proposed for M. thermautotrophicus (see above), in a 
mode akin to that employed by bacteria. 

Finally, as mentioned above, studies of the levels 
of Sulfolobus Orc1/Cdc6 homologs revealed that 
Cdc6-1 and Cdc6-3 are highest in G1- and S-phase 
cells, while Cdc6-2 peaks in G2 cells (97). In bacteria, 
the DnaA replication initiator protein plays dual roles 
as a replication and transcription factor; whether this 
is the case for the Orc1/Cdc6s is currently unknown. 
However, it is tempting to speculate that the fact that 
origins of replication lie immediately upstream of the 
cdc6-1 and cdc6-3 genes may provide a mechanism for 
coordinating expression of the initiator factors with origin 
activity. The scheme described in Fig. 7 illustrates 
a simple binary oscillator that could coordinately 
modulate the activities of the origins and expression 
of initiator and regulator proteins. If the Orc1/Cdc6 
proteins also modulate the expression of additional 
downstream genes, then this simple loop could lie at 
the heart of the Sulfolobus cell cycle. Given the availability 
of purified Orc1/Cdc6 proteins and a defined 
in vitro transcription system for Sulfolobus, it should 
be possible to test the predictions of the model. 
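
The logic of such a binary oscillator can be illustrated with a deliberately crude Boolean sketch. This is not the model of Fig. 7 itself: it assumes that each protein pool is simply "high" or "low", that Cdc6-1 and Cdc6-3 behave as one pool, and that both pools are updated together once per step. The function names `step` and `simulate` are hypothetical and exist only for this illustration.

```python
# Toy Boolean version of the two-state loop described in the text.
# Assumptions (not from the source): levels are binary, Cdc6-1 and Cdc6-3
# are treated as a single pool, and updates are synchronous.

def step(cdc6_13_high, cdc6_2_high):
    """Return the next (Cdc6-1/3, Cdc6-2) state."""
    # Cdc6-1/3 represses its own genes, so it is high in the next step
    # only when Cdc6-2 is present to activate cdc6-1 and cdc6-3.
    next_13 = cdc6_2_high
    # Cdc6-2 rises after Cdc6-1/3 has been high and represses itself.
    next_2 = cdc6_13_high and not cdc6_2_high
    return next_13, next_2


def simulate(start, n_steps):
    """Return the list of states visited, beginning with `start`."""
    states = [start]
    for _ in range(n_steps):
        states.append(step(*states[-1]))
    return states


if __name__ == "__main__":
    for t, (a, b) in enumerate(simulate((True, False), 4)):
        print(t, "Cdc6-1/3:", "high" if a else "low", "| Cdc6-2:", "high" if b else "low")
```

Started with Cdc6-1/3 high and Cdc6-2 low, the two pools simply alternate, which is the minimal behavior the loop requires. Started with both pools low, this toy version stays silent, a reminder that any real circuit would need additional inputs to start and time the cycle.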


PERSPECTIVE: THE NEXT FIVE YEARS 


Tremendous advances have taken place in our 
knowledge of the function of individual archaeal DNA 
replication proteins over the past few years. Clear 
goals for the near future lie in the integration of the 
individual components into coordinated in vitro 
systems designed to address the complex macromol- 
ecular interactions and transactions that mediate pro- 
cesses such as MCM loading and helicase-linked 
replication fork progression. 

Figure 7. A model for the interplay between origin activity and transcriptional regulation of cdc6-1 and cdc6-3 genes. The 
model postulates that upon binding to the origins adjacent to their own genes, Cdc6-1 and/or Cdc6-3 exert negative regulation 
of their own expression. It is further proposed that the cdc6-2 gene becomes active at late S phase. This could conceivably 
be by Cdc6-1 and/or Cdc6-3 activating cdc6-2 expression (not shown). In G2, remaining Cdc6-1 and Cdc6-3 protein 
levels decay and Cdc6-2 levels rise. Cdc6-2 could act as an activator of cdc6-1 and cdc6-3 and as repressor of its own expression, 
thereby reducing its own levels, elevating Cdc6-1 and Cdc6-3, and preparing cells for another round of replication following 
division. 


Beyond the challenges of understanding the 
mechanistic architecture of the replication appara- 
tus lies the necessity of improving our understand- 
ing of the molecular basis of the archaeal cell cycle. 
How is progression of the cycle driven; will post- 
translational modifications and targeted degradation 
of key proteins lie at the heart of the cycle, or will it 
be powered by comparatively simple transcriptional 
feedback loops? Given that the crenarchaeal and 
euryarchaeal cell division systems appear to have 
some fundamental differences, not the least being the 
presence or absence of FtsZ, it may be an oversim- 
plification to talk of an “archaeal” cell cycle. It is pos- 
sible that distinct mechanisms for controlling cell 
cycle progression exist in the two phyla. With ever- 
improving systems for the genetic manipulation of 
an increasing range of archaeal species and the power 
of transcriptomic and proteomic approaches, it can 
be anticipated that progress on these issues will be 
rapid and exciting. 

Chapter 4 


DNA-Binding Proteins and Chromatin 


RACHEL SAMSON AND JOHN N. REEVE 


INTRODUCTION 


The genomes of all organisms have to be com- 
pacted to fit within the confines of a cell or nuclear 
compartment. In higher eucarya, two meters of DNA 
are condensed to fit within a nucleus that is ~10 µm 
in diameter, and bacteria and archaea must similarly 
compact their genomes to fit within a cell that is a 
thousand times shorter (25, 33, 53, 85). The very 
high concentrations of proteins and nucleic acids that 
are present in vivo aid DNA condensation by dis- 
rupting DNA-solvent interactions, and cations also 
neutralize many repulsive negative charges along the 
DNA backbone (53). Given such conditions of macro- 
molecular crowding and ionic environment, the issue 
in vivo is not how to collapse a large chromosome 
structure, but how to do so in a manner that pre- 
serves flexibility and access to the DNA molecule for 
replication and gene expression. In the three biologi- 
cal domains, several structurally unrelated families 
of chromatin proteins exist that have apparently 
evolved independently to facilitate genome conden- 
sation, prevent DNA aggregation, and allow expres- 
sion. As described, some of these chromatin proteins 
are present predominantly in one biological domain, 
but there are overlaps arguing for ancient common 
ancestries or lateral gene transfer. 
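
The scale of the packaging problem can be made concrete with a one-line calculation. The sketch below uses only the two figures quoted above for higher eucarya (two meters of DNA, a nucleus ~10 µm across); the function name `linear_compaction` is introduced here purely for illustration, and the ratio is a crude length comparison, not a measure of volume occupancy.

```python
# Figures quoted in the text, both expressed in micrometres.
DNA_LENGTH_UM = 2_000_000      # two metres of DNA
NUCLEUS_DIAMETER_UM = 10       # nucleus ~10 micrometres in diameter


def linear_compaction(dna_length_um, container_um):
    """Ratio of DNA contour length to the longest dimension of its container."""
    return dna_length_um / container_um


ratio = linear_compaction(DNA_LENGTH_UM, NUCLEUS_DIAMETER_UM)
print(f"DNA is about {ratio:,.0f} times longer than the nucleus is wide")
```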


BACTERIAL CHROMATIN 
PROTEINS AND NUCLEOIDS 


Procaryotic genomic DNA and associated pro- 
teins together form an irregularly shaped structure, 
designated the nucleoid (33, 53). In Escherichia coli, 
this has been shown to have a central mass from which 
long supercoiled loops of transcriptionally active 
DNA extend (4). In many bacteria, including E. coli, 
members of the HU family are the most abundant 
chromatin proteins. The length of the DNA bound by 
different HU homologs is reported to range from ~9 
to 35 bp per dimer, with binding causing a sharp kink 
and minor groove widening and introducing nega- 
tive superhelicity (25, 37, 77). Cocrystals with DNA 
have revealed that HU dimers have two long flexible 
β-ribbon arms that extend from a central globular 
body and interact with the phosphodiester backbone 
within the minor groove of the DNA helix (77). E. coli 
mutants lacking HU are viable, indicating that other 
chromatin proteins can provide compensatory func- 
tions (25). HU-family members are also present in 
Giardia lamblia and dinoflagellates that could be 
remnants of endosymbiotic events (81, 86), as is al- 
most certainly the case for HU in plant plastids (69). 
Alternatively, these eucaryal HU-family members may 
have been acquired by horizontal gene transfer from 
bacteria, as was most likely the origin of the archaeal 
HU-family members present in the Thermoplasma 
lineage (22). 

The E. coli nucleoid contains several other chro- 
matin proteins, the most abundant being H-NS and 
FIS, which also function as transcription activators 
and repressors (3, 25). H-NS binds with high affinity 
to bent DNA structures and can constrain supercoils 
in DNA. FIS binds both nonspecifically and in a 
sequence-dependent manner to DNA, generating se- 
vere bends in the DNA helix. It is frequently reported 
that bacterial chromatin proteins change in abun- 
dance with growth phase (3, 24, 66), and in this re- 
gard, H-NS increases 5- to 10-fold, whereas FIS de- 
creases in abundance in E. coli cells as cultures enter 
the stationary phase. Such changes are likely to be re- 
lated to, if not responsible for, many of the changes in 
gene expression that occur in different growth phases, 
but this has not yet been definitively documented. 
Changes in the chromatin proteins present also occur 
when bacteria differentiate. For example, Bacillus 
subtilis spores contain DNA-binding proteins, desig- 
nated small acid-soluble proteins (SASPs), that are not 
present in vegetative cells (71). The SASPs establish 
the structure of the spore nucleoid, protect the spore 
DNA from radiation and desiccation damage, and si- 
lence gene expression. Similarly, the nucleoid of the 
metabolically inert, infectious form of Chlamydia is 
compacted into a tight spherical structure by Hc1 and 
Hc2 (60), two proteins related to eucaryal histone H1 
that are not present in vegetative cells. 


EUCARYAL HISTONES AND CHROMATIN 


In contrast to the range of different chromatin pro- 
teins identified in bacteria, almost all eucaryal genomes 
are compacted into nucleosomes, chromatin, and chro- 
mosomes by essentially the same four proteins, his- 
tones H2A, H2B, H3, and H4. A nucleosome core 
consists of a histone (H3+H4)2 tetramer flanked on 
either side by a histone (H2A+H2B) heterodimer. 
This histone octamer binds and wraps ~146 bp of 
DNA in 1.65 negative superhelical turns to form a 
nucleosome unit (46). All four nucleosome core histones 
form the same structural motif known as the histone 
fold (HF) with three α-helices (α1, α2, α3) separated 
by two short β-strand loops (L1, L2) (46, 68). 
N- and C-terminal sequences extend from the HF that 
participate in higher-order nucleosome polymerization. 
They also provide the targets for posttranslational 
acetylation, methylation, phosphorylation, and ubiq- 
uitinylation events that regulate eucaryal chromatin 
structure and gene expression (18, 55, 74). Acetyla- 
tion of histone-tail lysine residues reduces histone- 
DNA affinity and interactions between neighboring 
nucleosomes, and this facilitates transcription factor 
access to the DNA and so relieves repression of gene 
expression imposed by chromatin structure. Histone- 
modifying enzymes, such as histone deacetylases and 
histone acetyltransferases, are therefore frequently 
identified as activators or repressors of eucaryal gene 
expression (52, 82). The DNA between eucaryal nu- 
cleosomes is bound by proteins, designated linker hi- 
stones H1 or H5, although these are structurally un- 
related to the nucleosome core histones and do not 
have HFs (38). 


ARCHAEAL CHROMATIN PROTEINS 


The following sections describe several different 
families of archaeal chromatin proteins with unre- 
lated structures, but with the common properties of 
abundance, small size, positive charge, and ability to 
bind to DNA with little or no sequence specificity (64, 
85). It is presumed that these proteins compact ar- 
chaeal genomic DNA in vivo and prevent DNA ag- 
gregation, and that they probably also participate in 
DNA replication, repair, and gene expression. These 
roles in vivo have not, however, yet been established 
experimentally and, in the one published report, the 
introduction of mutations into archaeal chromatin 
genes was not lethal (34). 


Archaeal Histones 


Archaeal histones were first discovered in metha- 
nogens (67), and then later in other members of the 
Euryarchaeota but, until very recently, were notice- 
ably absent from all Crenarchaeota (64, 68). This ar- 
gued that histones evolved after the divergence of the 
two major archaeal lineages, consistent with hypotheses 
that posit that the eucaryal nucleus originated from 
a euryarchaeon (50). However, the Nanoarchaeum 
equitans genome was found to contain two histone 
genes, demonstrating that histones are present in this 
third very deep branching archaeal lineage (13), and 
histone-encoding genes have now also been identified 
in genomic DNA that most likely originated from a 
marine crenarchaeon, and also in the genome of the 
crenarchaeal marine sponge symbiont, Cenarchaeum 
symbiosum (19). Histones do therefore exist in meso- 
philic crenarchaeota but have apparently been lost, 
and presumably functionally replaced, in hyperthermophilic 
crenarchaeota by proteins such as Sul7d, 
Sul10a, and Sul10b (Alba), described in sections below. 

Most archaeal histones are just HFs without N- 
or C-terminal extensions (Fig. 1). The HF is stable 
only in a dimer configuration, and archaeal histones 
form both homodimers and heterodimers in vitro, 
and presumably also in vivo, in species that have mul- 
tiple histones (64, 68). In this regard, Methanosarcina 
species have the largest archaeal genomes but have 
only one histone gene, whereas the much smaller ge- 
nomes of Methanocaldococcus jannaschii and Metha- 
nosphaera stadtmanae encode six and seven different 
archaeal histones, respectively. The most thoroughly 
characterized archaeal histone is HMfB from Meth- 
anothermus fervidus (21), and this is the archetype 
of the most prevalent family of archaeal histones, 
proteins with 65 to 69 residues that form one HF. The 
HF is stabilized by an intramolecular salt bridge between 
an arginine in L2 (R52 in HMfB) and an aspartate 
in α3 (D59 in HMfB) and by intermolecular 
hydrophobic interactions between α2 and α3 residues 
within the core of a dimer (Fig. 2A). In the presence 
of DNA, archaeal histone dimers associate to form 
tetramers that bind and wrap ~90 bp of DNA (5, 47, 
80, 87). The archaeal nucleosomes generated (61) retain 
the flexibility to wrap DNA in either a positive 
or negative supercoil (47), and this flexibility may 
help accommodate the activities of DNA polymerases, 
RNA polymerase, and topoisomerases that 
generate local DNA supercoiling. Histones bind to 
the sugar-phosphate backbone of DNA (46) and so 
can incorporate any DNA sequence into a nucleosome, 
although DNA sequences that readily distort 
to accept alternating major and minor groove compressions 
are preferentially bound. Consistent with 
genome sequences having evolved to facilitate their 
own packaging, dinucleotide repeats that facilitate 
nucleosome assembly are overrepresented in the 
genomes of histone-containing species, including archaea 
(6, 70). In M. fervidus, the archaeal histones 
HMfA and HMfB constitute ~4% of total soluble 
protein, sufficient for one histone tetramer per ~67 
bp of genomic DNA, assuming a genome size of 
~1.7 Mbp. Chromatin immunoprecipitation studies 
indicate that most genomic sequences in this hyperthermophilic 
archaeon are associated in vivo with histones 
(61, 62). 


Figure 1. Phylogenetic tree based on rDNA sequence alignments of selected organisms. Branch lengths do not reflect evolutionary distances, but the branching 
orders are correctly represented. The numbers of histones (H), Sul7d (S), Alba (A), MC1 (M), 7kMk, and HU-family members encoded in the genomes of 
representative archaea are denoted in parentheses. The H. salinarium NRC-1 histone has two HFs in one polypeptide (*). M. kandleri has two members of 
the HMfB family of archaeal histones with one HF (^), and also HMk, an archaeal histone with two HFs. 

Figure 2. (See the separate color insert for the color version of this illustration.) Sequences and structures of representative 
archaeal chromatin proteins. Primary sequences of HMfB from M. fervidus (A), Sul7d (Sac7d) from S. acidocaldarius (B), Alba 
(Sso10b1) from S. solfataricus (C), and MC1 from Methanosarcina sp. CHTI55 (D) are shown below the corresponding protein 
structure. The figure was constructed using structures available from the Protein Data Bank (11). Regions with α-helical 
and β-strand structures are colored identically in the sequence and in the corresponding structure. 


Eucaryal histone (H3+H4) heterodimers are 
asymmetric (46). This asymmetry positions surface- 
located DNA-binding residues appropriately and also 
prevents higher-order histone dimer-dimer oligomer- 
ization. In this regard, two archaeal histones, MkaH 
in Methanopyrus kandleri and HHb in Halobac- 
terium salinarum NRC1, have inherently asymmet- 
ric HF “dimer” structures with two nonidentical HFs 
tandemly linked in one polypeptide chain (Fig. 1; ref- 
erence 64). This guarantees HF heterodimer forma- 
tion and an asymmetry that could direct nucleosome 
assembly (27, 58). How archaeal histone homodimer 
polymerization is limited in Methanosarcina species 
that have only one archaeal histone gene is an in- 
triguing question. 

Archaeal histones that do have C-terminal ex- 
tensions have been identified in methanococcal species 
(64). The extensions are ~25 residues in length and 
are predicted to fold into α-helices, but they share no 
sequence identity with eucaryal histone tails. Deletion 
of the C-terminal extension reduced the thermosta- 
bility of such a histone (MJ1647) from M. jannaschii, 
but increased its ability to bind DNA (43). Apparently, 
in vivo, the C-terminal extension of MJ1647 
obstructs its homopolymerization and so helps direct 
the assembly of archaeal nucleosomes in M. jannaschii 
with asymmetric heterotetramer histone cores. 

The only amino acid within the HF of a eucaryal 
nucleosome core histone that has been shown to undergo 
posttranslational regulatory modification is K69 
in the expanded L1 region of histone H3 (12). In- 
triguingly, NEQ0288, one of the two histones in N. 
equitans (13), has a related sequence including a ly- 
sine residue present at the same location within L1. 
It will be interesting to determine whether this lysine 
is a target for regulatory methylation in NEQ0288 
as is K69 in H3 (12). 


Sul7d 


Sul7d is the generic name given to a family of 
very abundant, highly conserved ~7-kDa proteins (1, 
44). Individually, they are designated Sac7d in Sulfolobus 
acidocaldarius, Sso7d in Sulfolobus solfataricus, 
and Sto7d in Sulfolobus tokodaii (Fig. 1). These 
hyperthermophilic crenarchaeotes do not have his- 
tones. Members of the Sul7d family constitute up to 
~5% of total cellular protein. They bind DNA non- 
cooperatively with one monomer contacting ~4 bp of 
DNA (15, 29, 44, 51). Binding unwinds the DNA and 
introduces negative superhelicity which could com- 
pensate for the overwinding of DNA that occurs in- 
herently at hyperthermophilic growth temperatures 
(see Chapter 19). When bound to closed DNA mole- 
cules, Sul7d unwinding of the duplex increases DNA 
writhe and compacts positively supercoiled and re- 
laxed DNAs, but decompacts negatively supercoiled 
DNA molecules (44, 54). 
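
The different outcomes for relaxed, positively supercoiled, and negatively supercoiled DNA follow from simple topological bookkeeping. The relation below is the standard one for closed circular DNA, linking number equals twist plus writhe; it is not stated in the text and is supplied here as an assumption to make the reasoning explicit.

```latex
\begin{aligned}
Lk &= Tw + Wr \\
\Delta Lk &= 0 && \text{(closed molecule, no strand breakage)} \\
\Delta Tw &< 0 && \text{(Sul7d binding unwinds the duplex)} \\
\Rightarrow \quad \Delta Wr &= -\Delta Tw > 0
\end{aligned}
```

A positive shift in writhe moves relaxed DNA (Wr near zero) and positively supercoiled DNA (Wr > 0) further from zero, so they become more compact, whereas negatively supercoiled DNA (Wr < 0) is moved toward zero and so decompacts.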

NMR solution and X-ray crystal structures have 
been solved for Sul7d-DNA complexes. The barrel- 
shaped protein consists of a double-stranded β-sheet 
packed against a triple-stranded β-sheet, followed by 
a C-terminal amphiphilic α-helix (29, 40) (Fig. 2B). 
Residues in the triple-stranded β-sheet contact four 
bases in the minor groove of DNA, with the hy- 
drophobic side chains of V26 and M29 intercalating 
between bases and causing a ~66° bend in the DNA 
(1, 63, 76). The protein binds largely through non- 
polar interactions with deoxyribose units in the mi- 
nor groove, although electrostatic interactions, salt 
bridges, and hydrogen bonds also occur that together 
provide strong DNA binding with a low degree of se- 
quence specificity (8). 

Although the mechanism and biological func- 
tion(s) are unknown, 5 of the 14 lysine residues in 
Sul7d are specifically monomethylated in vivo. This 
methylation does not change the DNA affinity or 


thermal stability of the protein (76). In addition to its 
likely role in chromosome compaction, Sul7d has a 
surprising number of additional activities. It facili- 
tates the annealing of complementary single-stranded 
DNAs and has RNase, ATP-dependent protein chap- 
erone, and protein renaturation activities (30, 31, 72). 


Sul10b/Alba 


Hyperthermophilic Sulfolobus species contain 
Sul10a and Sul10b, in addition to Sul7d, and whereas 
Sul7d and Sul10a have limited phylogenetic distribu- 
tion, Sul10b is widely distributed throughout the Ar- 
chaea. Individual proteins have been designated Sso10b 
in S. solfataricus, Ssh10b in S. shibatae, Sac10b in 
S. acidocaldarius, Afu10b in Archaeoglobus fulgidus, 
and Mja10b in M. jannaschii (Fig.1; references 20, 
32, 83, 84, and 89). Sso10b has also been designated 
Alba, based on the observation that acetylation low- 
ers DNA-binding affinity (10). Crystal structures of 
Sso10b/Alba dimers from S. solfataricus, A. fulgidus, 
and M. jannaschii have revealed a β1-α1-β2-α2-β3-β4 
monomer secondary structure with an overall 
topology similar to that of the C-terminal domain of 
E. coli initiation factor IF3, and the N-terminal DNA- 
binding domain of DNase I (83, 84, 89). The Alba 
monomers are associated in an antiparallel orientation, 
via residues in the α2-helix and the β3- and β4-strands. 
The dimer has two β-hairpin structures extending 
from a globular central body (Fig. 2C). Alba 
dimers bind DNA cooperatively and can constrain 
DNA into negative supercoils (10, 35, 88). In one 
model proposed for DNA binding, the central body 
of the dimer is positioned across the major DNA 
groove, and the β-hairpins extend to make contacts 
within the flanking minor grooves (84). 

A very intriguing observation is that Alba (Sso10b) 
from S. solfataricus is acetylated at K16 in vivo, and 
acetylated Alba has a 3-fold lower affinity for DNA 
than nonacetylated Alba. The enzymes responsible 
for Alba acetylation and deacetylation in S. solfatar- 
icus have been characterized, and acetylation and 
deacetylation in vitro have been established (10, 48, 
49). When deacetylated Alba binds to the template 
DNA, transcription in vitro is repressed and this re- 
pression is relieved by acetylation (10). This is remi- 
niscent of the acetylation and deacetylation of lysine 
residues in eucaryal histone tails regulating eucaryal 
chromatin structure and gene expression (18, 52, 74, 
82). K16 acetylation may not directly reduce Alba 
affinity for DNA, but rather may modulate Alba 
dimer-dimer interactions required for Alba polymer- 
ization and so binding along a DNA duplex. Regard- 
less of the molecular details, this indicates that chro- 
matin structure, and thereby gene expression, might 
be regulated in archaea by chromatin protein modifications. 
To date, Alba acetylation has only been documented 
in S. solfataricus, and not all Alba family members 
have a lysine at the site of K16 in Sso10b (89). 

A bioinformatics analysis concluded that Alba is 
a member of a superfamily of proteins, most of which 
are involved in RNA metabolism, for example, sub- 
units of RNase P, RNase MRP, and Mdp2 (2). To- 
gether with the very wide distribution of Alba pro- 
teins in archaea, this has led to the hypothesis that 
Alba may be an RNA-binding protein in some ar- 
chaea that has been recruited in other lineages to be- 
come a chromatin protein (2). Alba does bind to both 
DNA and RNA in Sulfolobus species (32, 49), but 
chromatin immunoprecipitation experiments argue 
convincingly that Alba is bound to genomic DNA and 
functions as a chromatin protein in S. solfataricus (49). 

As is often the case with chromatin proteins, 
some archaea have more than one Alba (Alba1 = 
Sso10b1; Alba2 = Sso10b2 [17, 35]). In A. fulgidus 
and M. kandleri, the two Alba proteins have very sim- 
ilar sequences, consistent with recent gene duplication 
events, but in S. solfataricus, S. tokodaii, S. acidocal- 
darius, and Aeropyrum pernix, the Alba1 and Alba2 
sequences are only 30 to 40% identical. In S. solfa- 
taricus, in stationary phase, Alba1 is 20-fold more 
abundant than Alba2, suggesting that Alba2 may 
have a regulatory rather than structural function. In 
vitro, (Alba1+Alba2) heterodimers bind simulta- 
neously to two DNA duplexes at a protein to DNA 
ratio of 1 dimer/12 bp, which results in DNA com- 
paction (35). DNA is similarly compacted by bind- 
ing of Alba1 homodimers to two DNA duplexes at 
this ratio, but when increased to 1 dimer/6 bp, Alba1 
homodimers coat the DNA to form nucleoprotein fil- 
aments that result in little or no DNA compaction. To 
form such a filament, Alba1 homodimers assemble 
end to end (84, 89). The residues at this dimer-dimer 
interface are not conserved in Alba2, and Alba2 ho- 
modimers and (Alba1+Alba2) heterodimers do not 
form nucleoprotein filaments (17). The biological 
function of Alba2 may therefore be to form (Alba1+ 
Alba2) heterodimers that limit nucleoprotein filament 
formation by Alba1 homodimers. Alba2 limiting 
Alba1 oligomerization could be another mechanism 
of regulation, based on chromatin structure, added to 
the regulation based on K16 acetylation-deacetylation 
of Alba1 in S. solfataricus (85). 


Sul10a 


Sul10a is the generic name of an abundant ~11 
kDa DNA-binding protein investigated from S. aci- 
docaldarius (Sac10a) and S. solfataricus (Sso10a) (16, 
26, 36). X-ray crystal and NMR solution structures 
have established that Sso10a dimers are highly elongated, 
with a two-stranded antiparallel coiled-coil rod 
central region separating two globular winged-helix 
DNA-binding domains. This structure is similar to 
structures established for MotA, CueR, and ModE 
bacterial transcription regulators, and for a DNA 
replication terminator protein, RTP. Consistent with 
Sac10a being a sequence-specific DNA-binding pro- 
tein, it exhibits a pronounced binding preference for 
poly(dA/dT) over poly(dG/dC) or E. coli genomic 
DNA. However, winged-helix DNA-binding motifs 
are not only present in sequence-specific transcription 
factors (see Chapter 6). This structure is also present 
in the eucaryal H1 linker histone (38). Electron mi- 
croscopy has revealed that Sac10a binds and coats 
double-stranded DNA at high protein to DNA ratios, 
and under such conditions Sac10a binding does intro- 
duce supercoiling into closed circular DNA molecules. 


MC1 


Methanogen chromosomal protein 1 (MC1) is an 
abundant ~93-amino-acid residue chromosomal pro- 
tein characterized from Methanosarcina that, based 
on genome sequencing, is also present in Halobac- 
terium species (Fig. 1). An NMR solution structure 
has been established for MC1 from Methanosarcina 
sp. CHTI55 (56). This has a barrel-shaped protein fold 
similar to that of Sul7d, with a double-stranded β-sheet 
linked by an α-helix to a triple-stranded β-sheet (Fig. 
2D). Photochemical cross-linking of MC1-DNA com- 
plexes has established that the DNA-binding region 
is located between K69 and K87. This region contains 
part of a conserved loop located between the second 
and third β-strands of the triple β-sheet and three 
conserved lysine residues and one tryptophan residue 
that likely interact directly with DNA (56). MC1 
monomers bind noncooperatively to double-stranded 
DNA, kinking the DNA to an angle of 116° (14, 23, 
42). Although MC1 binding to DNA appears to be 
sequence independent, preferential binding has been 
observed with some DNAs that most likely have read- 
ily deformable sequences (57). MC1 has strong affin- 
ity for bent, negatively supercoiled and four-way junc- 
tion DNAs (78, 79). DNase I protection results 
suggest that the DNA wraps around MC1, and, al- 
though no structure has yet been reported for an 
MC1-DNA complex, models have been proposed in 
which MC1 binds to both the inside and outside of a 
curved DNA molecule (56). 


7kMk 


In addition to the MkaH histone, M. kandleri 
also contains three other chromatin proteins with
molecular masses of ~7, 10, and 30 kDa (59). The
7-kDa protein, designated 7kMk, has been characterized
and shown to be a member of the ribbon-helix-helix
family. It is a homodimer in solution and likely
folds into a structure with an N-terminal β-strand
followed by four α-helices (59). 7kMk binds to
double-stranded DNA without any apparent sequence
specificity, forming looplike structures and introducing
negative supercoils. Two binding models have been 
proposed in which the DNA is either wrapped around 
a preformed 7kMk protein core, or 7kMk binds to 
the DNA cooperatively, gradually bending the DNA 
into a left-handed superhelical loop (59). 


HU-Related Proteins 


Members of the Thermoplasma lineage are eury- 
archaeota that do not have histones, but rather have 
HU-related chromatin proteins (65). HTa from Ther- 
moplasma acidophilum was, in fact, the first archaeal 
chromatin protein to be studied in detail (22). The 
sequence of this 89-amino-acid residue protein shows 
a common ancestry with bacterial HU proteins. HTa 
binding to DNA reduces the contour length of DNA 
and stabilizes the DNA against heat denaturation. 
Nucleoprotein particles were calculated to have an 
HTa tetramer core that bound and circularized 40 bp 
of DNA (21), but this structure now seems very un- 
likely, given the extreme DNA distortion that circu- 
larizing 40 bp would require. It seems most likely that 
HTa binds to DNA and forms complexes similar to 
those formed by bacterial HU proteins (25, 77), but 
this remains to be determined. 


Participation of Chromatin Proteins 
in Transcription and DNA Metabolism 


It is apparent that many different chromatin pro- 
teins have evolved, all of which must bind and com- 
pact DNA into complexes that are readily disassem- 
bled, or that are inherently compatible with DNA 
replication and transcription machineries. It is now 
understood that chromatin proteins that were origi- 
nally considered to have only architectural functions, 
such as HU in bacteria and the eucaryal nucleosome 
core histones, actually participate in regulating gene 
expression (3, 18, 25, 55, 74), and it has been pro- 
posed that differences in bacterial versus eucaryal 
chromatin structure result in fundamentally different 
mechanisms of regulating gene expression (75). In eu- 
carya, histone-compacted chromatin is considered to 
be generically repressive. Gene expression therefore 
requires transcription activators, for example, histone 
acetylases that help disassemble chromatin and so allow
transcription factor access to the DNA (18, 52,
55, 74, 82). In contrast, the nucleoid structure in bacteria
is argued not to prevent transcription factor access,
and therefore promoter-specific repressors are
needed to prevent inappropriate gene expression (75). 
The validity and extensions of these arguments to ar- 
chaea remain to be determined. The archaeal tran- 
scription initiation machinery does most closely re- 
semble the eucaryal RNA polymerase (RNAP) II 
system (see Chapter 6), but most of the transcription 
regulation documented to date involves repressors 
rather than activators (7, 9, 64). Archaeal transcrip- 
tion initiation in vitro is inhibited by archaeal histone 
assembly of the promoter into an archaeal nucleo- 
some, but downstream archaeal nucleosomes do not 
block archaeal RNAP elongation after initiation (87). 

As most archaeal histones lack N- and C-terminal 
tails, and given the absence of genomic evidence for 
archaeal homologs of eucaryal chromatin-remodeling 
complexes, and the direct evidence that archaeal his- 
tones isolated from M. jannaschii do not have post- 
translation modifications (28), it seems unlikely that 
gene expression in archaea is regulated by histone 
modification. Archaeal histones could, however, still 
participate in regulating gene expression as different 
archaeal histones in the same species have different 
affinities for different DNA sequences (6, 47). By
controlling the relative abundances of these histones,
and possibly also the extent of homodimer versus
heterodimer formation, a cell could position archaeal
nucleosome assembly in vivo to regulate gene expression
(68). By using chromatin immunoprecipitation
(62), after in vivo histone-to-DNA crosslinking,
it should be possible to establish where nucleosomes 
are located throughout an archaeal genome under dif- 
ferent growth conditions, and therefore, how such 
positioning relates to gene expression. 

As described, Alba (Sso10b1) binding to DNA 
is modulated by posttranslation acetylation in S. sol- 
fataricus and, in this archaeon, gene expression may 
well be regulated by modifying chromatin structure 
(10, 35, 884). It is clearly important to determine 
whether such regulation extends beyond Sulfolobus 
species, in particular, to species where the residue in 
Alba at the position homologous to K16 in Alba1/ 
Sso10b1 is not a lysine (87). Some archaeal genomes
that do not encode Alba proteins nevertheless encode
homologs of the Sir2 deacetylase that deacetylates
Alba1/Sso10b1 in S. solfataricus.

Few archaeal DNA replication/repair investiga- 
tions to date have studied the participation of ar- 
chaeal chromatin proteins (39, 41, 63), although 
Sul7d has been shown to modulate the primer exten- 
sion and excision activities of the S. solfataricus 
proofreading DNA polymerase B1 (PolB1). Sul7d 
apparently binds to and stabilizes double-stranded
DNA, inhibiting primer cleavage by PolB1 and encouraging
primer extension (45). The presence of
Sul7d did not affect cleavage of single-stranded DNA 
or the proofreading ability of the polymerase, sug- 
gesting that Sul7d may play a role in balancing the 
polymerization versus exonuclease activities of this 
DNA polymerase in favor of polymerization. Bind- 
ing Sul7d to the DNA did not inhibit DNA unwind- 
ing by the S. solfataricus replicative MCM helicase. 
However, DNA unwinding by this helicase was in- 
hibited by Alba1, and this inhibition was reduced by 
acetylation of Sso10b/Alba, consistent with acetyla- 
tion reducing the affinity of Alba1 for DNA (48). 
Studies with the Methanothermobacter thermauto- 
trophicus MCM helicase have established that this 
enzyme can unwind DNA bound by an archaeal his- 
tone into an archaeal nucleosome, or bound into a 
transcription preinitiation complex, but this enzyme 
could not unwind DNA held in a transcription elon- 
gation complex (73). 


PERSPECTIVE: THE NEXT FIVE YEARS 


With the accumulation of genome sequences, it is 
now apparent that most archaea have the capacity to 
synthesize several different chromatin proteins. Some 
of these proteins (e.g., histones, Alba) have wide dis- 
tribution, whereas others (e.g., Sul7d, 7kMk, MC1) 
have restricted phylogenetic distributions (Fig. 1). 
Based on bacterial precedents (25), it seems likely that 
many of these proteins will have compensating and 
overlapping functions in chromatin organization, but 
probably very specific and noncompensating func- 
tions in regulating gene expression. Also based on 
bacterial precedents, archaeal chromatin proteins are 
probably not individually essential for viability, but by 
mutational inactivation, their roles in specific DNA 
metabolic events may be identified. Currently, the 
consequences of adding purified archaeal chromatin 
proteins to in vitro DNA replication, repair, and tran- 
scription systems are being determined, and these ex- 
periments will identify specific reactions that are in- 
hibited or stimulated by chromatin proteins binding 
to the template DNA. Crosslinking chromatin pro- 
teins to genomic DNA in vivo, followed by the isola- 
tion and identification of the DNA bound by micro- 
array hybridizations, will identify where chromatin 
proteins are located in vivo, and this will (or will not) 
be correlated with specific gene expression. But to 
prove the role of a chromatin protein in vivo in any 
complex process will require substantially more so- 
phisticated genetics than is currently available for ar- 
chaea. However, the development of such genetic 
technology is on the horizon (see Chapter 21), and
with such developments there will be rapid progress in
understanding and dissecting the detailed structural
and regulatory roles of archaeal chromatin proteins.


Chapter 5


Mechanisms of Genome Stability and Evolution


DENNIS W. GROGAN 


INTRODUCTION: THE SIGNIFICANCE 
OF GENETIC PROCESSES IN ARCHAEA 


When researchers describe the scientific motiva- 
tions to study archaea, certain themes tend to recur. 
One is the evolutionary centrality of the Archaea, re- 
flecting the very early origin of this domain (122). This 
centrality implies that reconstructions of early cellu- 
lar evolution must account for the observed molecular 
properties of archaea, and, conversely, that individual 
molecular properties of archaea should be interpreted 
with respect to the wider evolutionary framework. 
Another theme emphasizes the uniqueness of archaea, 
in particular, those from “extreme” environments. 
Many of these archaea have proven to be exceptional 
organisms, and a number of them define physiologi- 
cal limits of life on earth. Archaea have been a rich 
source of phenomena that have revised certain rules 
of biology and could not be discovered in the well- 
established bacterial and eucaryal model species. 

Both the evolutionary centrality and the unique- 
ness of archaea have proven their validity and heuristic 
value over nearly three decades of research, but in jux- 
taposition these two ideas also create certain interest- 
ing tensions. For example, the uniqueness of archaea 
would seem, logically, to extend to their evolution. Un- 
less evolutionary forces and mechanisms are intrinsi- 
cally immutable across biology, archaeal evolution 
should exhibit certain features not evident in bacteria 
or eucarya, reflecting the extreme molecular and cellu- 
lar divergence among the three domains. Does archaeal 
evolution in fact deviate from the picture(s) evident 
from research on bacteria and unicellular eucarya? 

This broad, almost philosophical, question can 
be reduced to more concrete terms if one considers 
the roles of genetic processes in the molecular evolution
of microorganisms. According to modern Darwinian
theory, a fundamental “separation of powers”
distinguishes processes that produce genetic variation 
in natural populations (mutation, migration, and re- 
combination) from processes that filter it out (drift 
and selection); the resulting interplay determines the 
rate and course of change (Fig. 1). Properties of mu- 
tation and recombination can be measured under lab- 
oratory conditions and are expected to contribute 
more or less directly to the “nearly neutral” variation 
that dominates evolution at the molecular level (88). 
Even in cases where selection dictates the successful 
phenotype, properties of the genetic processes may still 
determine the route taken to reach it. In addition, the 
five primary processes depicted in Fig. 1 are them- 
selves affected by aspects of the organism’s biochem- 
istry, physiology, and ecology and can be expected, 
conversely, to shape the genetic properties of a mi- 
crobial lineage over evolutionary time. 

As detailed throughout this volume, archaea 
have many molecular and cellular features not seen in 
bacteria or eucarya, and many archaeal lineages have 
apparently been living for a very long time in envi- 
ronments considered (even by microbiological stan- 
dards) to be “extreme,” “harsh,” or “unusual.” It thus 
seems legitimate and significant to ask whether the 
third form of life employs a third form of genetics in 
its daily survival and evolutionary diversification. 
This question also serves to unify and organize the 
admittedly limited data on basic molecular-genetic 
processes in archaea, and to identify for future atten- 
tion topics of particular significance. Accordingly, this 
chapter examines the “natural genetics” of methano- 
genic, halophilic, and thermophilic archaea, progress- 
ing from the molecular scale to cells and finally to 
populations. Related topics addressed in other chapters
include DNA replication (see Chapter 3), the evolutionary
aspects of archaeal genomes revealed by sequencing (see
Chapter 19), and experimental techniques used to manipulate
archaea genetically (see Chapter 21).

Figure 1. The five processes that drive genetic change in natural populations. Each horizontal line represents a distinct genotype
(i.e., genome sequence) present in a hypothetical population of a microbial species as a function of time. New genotypes arise
in this population by mutation, immigration from other populations, and recombination. Conversely, genotypes are eliminated
randomly by drift, or according to functional properties determined by particular alleles, by selection. As a result of these
natural processes, the genetic composition of the population changes irreversibly over time. For a comprehensive discussion,
see Ridley (88).


SPONTANEOUS MUTATION 


All DNA replication systems make mistakes, and 
the possible mistakes outnumber the correct base at 
each position in the nucleotide sequence. Thus, the er- 
ror rate and the molecular nature of the errors made 
represent two fundamental genetic properties that 
can vary among microorganisms and affect genome 
stability and evolution. The isolation of “mutator” 
strains provides a practical example of this. In hap- 
loid microorganisms, a single mutation can inactivate 
an important accuracy-enforcing mechanism (such as 
DNA mismatch repair), yielding a strain with an ele- 
vated rate of spontaneous mutation and an altered 
mutational spectrum (13, 52). These mutator strains 
seldom exhibit serious growth defects; in fact, certain 
selections and growth regimens can actually give 
them a selective advantage, which facilitates their isolation
(64). However, the fact that nearly all microorganisms
isolated from nature have multiple accuracy-enforcing
mechanisms that are not essential for basic
cell viability implies that success in nature requires a 
higher level of genetic fidelity than does growth of 
pure cultures under laboratory conditions. 

The importance of genetic accuracy for evolu- 
tionary fitness becomes more obvious when mutation 
rates are compared for microorganisms ranging from 
bacteriophage to bacteria to fungi. Replication of 
these genomes involves widely varying error rates per 
base pair (10⁻⁷ to 10⁻¹¹ events per generation), but
remarkably constant rates per genome (about 0.003 
events per replication) (15). This conservation of a 
rate per genome at the expense of the rate per base 
pair implies a strong selective force in nature that re- 
wards this genomic rate as generally optimal. Ac- 
cordingly, this genomic rate provides a biologically 
meaningful reference for evaluating the replication fi- 
delity of a microorganism. Does this selective force 
extend to archaea, and, if so, can archaea growing in 
harsh, potentially mutagenic conditions attain the 
prescribed level of fidelity?

The technical demands for accurate mutation-rate
measurement have so far been satisfied only in
the thermoacidophilic crenarchaeon Sulfolobus aci- 
docaldarius. Essentially all mutations conferring re- 
sistance to 5-fluoro-orotic acid (FOA) inactivate the 
pyrE or pyrF gene of this species and can be detected 
with high efficiency by plating. For the greatest accu- 
racy, genic rates calculated from fluctuation tests 
using maximum-likelihood methods (47) are cor- 
rected for the effects, if any, of phenotypic lag and rel- 
ative fitness of mutant versus parent. In S. acidocal- 
darius, this approach showed that pyrE and pyrF 
mutations occurred at a combined rate of 3.3 × 10⁻⁷
events per cell division. This corresponds to about
2.6 × 10⁻⁷ phenotypically detected mutational events
per kilobase pair (kbp), which matches the average 
value calculated from several metabolic genes of Es- 
cherichia coli (42). Thus, replication of these two tar- 
get genes in S. acidocaldarius at 75 to 80°C exhibits 
fidelity comparable to that of similar genes in E. coli 
growing at 37°C. 
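
As a rough illustration of how a fluctuation test yields a rate per cell division, the sketch below uses the simple p0 (null-class) estimator. This is a deliberately simpler stand-in for the maximum-likelihood method cited above, not the procedure used in the cited study. The function name, the culture counts, and the final cell number are all hypothetical and serve only to show the arithmetic.

```python
import math


def p0_mutation_rate(mutant_counts, final_cells_per_culture):
    """Estimate mutational events per cell division with the p0 method.

    mutant_counts: resistant colonies observed in each parallel culture.
    final_cells_per_culture: viable cells per culture at plating
        (assumed to be the same for every culture).
    """
    cultures = len(mutant_counts)
    without_mutants = sum(1 for count in mutant_counts if count == 0)
    if cultures == 0 or without_mutants == 0:
        raise ValueError("the p0 method needs at least one culture with no mutants")
    # Poisson expectation: P(no mutation in a culture) = exp(-m),
    # so the mean number of mutational events per culture is m = -ln(p0).
    m = -math.log(without_mutants / cultures)
    return m / final_cells_per_culture


# Hypothetical data: 10 parallel cultures, 7 of which gave no resistant colonies.
counts = [0, 0, 1, 0, 3, 0, 0, 12, 0, 0]
rate = p0_mutation_rate(counts, final_cells_per_culture=1.0e8)
print(f"{rate:.2e} mutational events per cell division")
```

The p0 estimator ignores the sizes of the mutant clones and works only when a reasonable fraction of cultures contain no mutants, which is one reason maximum-likelihood treatments of the full distribution are preferred when accuracy matters.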

Calculating the mutation rate of the entire ge- 
nome requires two additional parameters: (i) the frac- 
tion of mutations that go undetected, and (ii) the 
genome size. Ideally, parameter (i) is determined em- 
pirically from the spectrum of spontaneous mutation. 
To ensure the S. acidocaldarius genomic rate was not 
underestimated, the pyrF gene (site of relatively few 
mutations) was excluded as a target, and parameter 
(i) was estimated more generously for the pyrE gene 
than indicated from the actual spectrum (36). Para- 
meter (ii) was estimated from published flow cytom- 
etry data to within one percent of the value later de- 
termined by sequencing (9, 36). The resulting genomic 
rate, 0.0018 mutational events per replication, falls 
slightly below the highly conserved value of 0.003 ob- 
tained for diverse mesophiles (15). This result con- 
firms that a strong, but as yet ill-defined, selection for 
a common error rate per genome applies to unicellu- 
lar life across three domains and contrasting ecologi- 
cal niches. It also demonstrates that hyperthermophilic 
archaea can meet this optimal level of accuracy de- 
spite the genotoxicity of their growth conditions and 
their peculiar situation with respect to DNA repair 
genes (see “DNA repair,” below). In fact, despite in- 
corporating assumptions that would tend to overes- 
timate the genomic rate, the calculations depict S. 
acidocaldarius as having the most stable genome of 
any organism known so far. To the extent that mole- 
cular evolution consists primarily in the accumulation 
of nearly neutral mutations (88), this result suggests a 
relatively slow molecular clock for lineages of Sul- 
folobus and perhaps other archaea that share the ge- 
netic properties of S. acidocaldarius (see “Transpos- 
able genetic elements,” below). 
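
The scaling from a single target gene to the whole genome described in this paragraph can be sketched as follows. The function name and every input value are placeholders chosen for illustration; they are not the values used in the cited study (36), and the genome size of about 2.2 Mbp is an assumed round figure. Only the reference value of 0.003 events per replication is taken from the text.

```python
REFERENCE_GENOMIC_RATE = 0.003  # conserved value for diverse mesophiles, from the text


def genomic_mutation_rate(target_rate, target_length_bp,
                          detected_fraction, genome_size_bp):
    """Scale a per-target mutation rate up to a per-genome rate.

    target_rate: detected mutational events per cell division in the target gene.
    target_length_bp: length of the target gene in base pairs.
    detected_fraction: fraction of mutations in the target that are detected,
        i.e. one minus parameter (i) of the text.
    genome_size_bp: parameter (ii) of the text.
    """
    if not 0 < detected_fraction <= 1:
        raise ValueError("detected_fraction must be in the interval (0, 1]")
    # Correct for undetected mutations, then express the rate per base pair.
    per_bp_rate = target_rate / (target_length_bp * detected_fraction)
    return per_bp_rate * genome_size_bp


# Placeholder inputs, for illustration only.
rate = genomic_mutation_rate(
    target_rate=2.0e-7,
    target_length_bp=600,
    detected_fraction=0.5,
    genome_size_bp=2.2e6,
)
print(f"{rate:.2e} events per replication "
      f"({rate / REFERENCE_GENOMIC_RATE:.2f} x the reference value)")
```

Lowering the assumed detected fraction raises the computed genomic rate, which is why a generous estimate of undetected mutations guards against underestimating the rate.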


One qualification of these conclusions has yet to 
be confirmed experimentally, namely, the extent to 
which the pyrE gene is representative of the S. acido- 
caldarius genome. However, it is possible to estimate 
the likelihood of this by assuming that the overall error
rate of replicating a typical S. acidocaldarius gene
is determined primarily by the length of its
longest mononucleotide tract, which is consistent 
with the mutational spectrum of pyrE (described be- 
low). By this criterion, only S. acidocaldarius genes 
containing tracts of eight or more A (T), or tracts of 
six or more G (C), are expected to mutate more fre- 
quently than pyrE. Analysis of the genome sequence 
identifies only 27 A (T) tracts and only 151 G (C) 
tracts meeting these criteria (D. Grogan, unpublished 
results). This predicts a higher mutational load from 
a maximum of about 190 of the 2,200 genes pre- 
dicted in the S. acidocaldarius genome. The quanti- 
tative effect on the genomic mutation rates is more 
difficult to estimate, but is likely to be modest. For ex- 
ample, assuming an average tenfold elevation over 
the pyrE rate in the 190 genes, counterbalanced by a 
tenfold reduction in the approximately 700 genes 
having mononucleotide tracts shorter than those in 
pyrE, yields less than a twofold increase in genomic 
mutation rate. 
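
The tract criterion and the closing arithmetic of this paragraph can be checked with a short sketch. It assumes, as a simplification not stated in the text, that every gene contributes equally to the genomic rate at the pyrE baseline; the function names are hypothetical. An "A (T)" tract is read here as a run of A or a run of T on the sequenced strand.

```python
import re


def longest_run(sequence, base):
    """Length of the longest uninterrupted run of one base in a sequence."""
    runs = re.findall(base + "+", sequence.upper())
    return max((len(run) for run in runs), default=0)


def exceeds_pyrE_criterion(sequence):
    """True if a gene has an A (T) tract >= 8 or a G (C) tract >= 6."""
    at_tract = max(longest_run(sequence, "A"), longest_run(sequence, "T"))
    gc_tract = max(longest_run(sequence, "G"), longest_run(sequence, "C"))
    return at_tract >= 8 or gc_tract >= 6


def relative_genomic_rate(total_genes, elevated_genes, elevated_factor,
                          reduced_genes, reduced_factor):
    """Genomic rate relative to a genome in which every gene mutates like pyrE."""
    baseline_genes = total_genes - elevated_genes - reduced_genes
    weighted = (elevated_genes * elevated_factor
                + reduced_genes * reduced_factor
                + baseline_genes)
    return weighted / total_genes


# Figures from the text: 190 genes tenfold up, about 700 genes tenfold down,
# out of about 2,200 predicted genes.
ratio = relative_genomic_rate(2200, 190, 10.0, 700, 0.1)
print(f"relative genomic mutation rate: {ratio:.2f}")
```

Under these assumptions the ratio stays below 2, in line with the estimate given above.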

A spectrum of spontaneous mutation provides 
additional information about genetic fidelity, includ- 
ing (i) which molecular errors occur during DNA 
replication in vivo and escape correction, (ii) the rel- 
ative frequencies of these various uncorrected errors, 
and (iii) any association of particular errors with par- 
ticular DNA sequences. Spectra of loss-of-function 
mutations have been determined for genes of E. coli, 
Saccharomyces cerevisiae, and other well-studied mi- 
croorganisms, enabling S. acidocaldarius to be com- 
pared with known systems in molecular terms. Al- 
though each of the target genes listed in Table 1 may 
be subject to some idiosyncrasy, trends emerge that 
suggest functional differences in the accuracy of DNA 
replication and repair. In S. acidocaldarius, for exam- 
ple, expansions and contractions of mononucleotide 
and triplet repeats account for the vast majority of 
spontaneous mutations. This is an obvious contrast 
to the S. cerevisiae spectrum, but it also differs from 
the E. coli spectrum, in that the pyrE mutations occur 
at several sites, as opposed to a single “hot spot” in 
the lacI gene (38). Expansions and contractions of 
short repeated sequences like these are attributed to 
slippage of the nascent and template strands during 
DNA synthesis; migration of the resulting mismatch 
back along the nascent strand helps these events es- 
cape correction by polymerase proofreading (32). 

Table 1. Comparison of mutational spectra across the three domainsᵃ

Mutation              S. acidocaldarius pyrE   E. coli lacI   S. cerevisiae URA3
A:T to G:C            1.0                      0.7            5.5
G:C to A:T            7.9                      6.2            22.2
Total transitions     8.9                      6.9            27.8
A:T to T:A            2.0                      1.0            21.1
A:T to C:G            <1                       1.0            5.5
G:C to T:A            <1                       1.8            14.4
G:C to C:G            1.0                      0.4            14.4
Total transversions   3.0                      4.1            55.5
-1                    61.4                     4.0            8.9
+1                    17.8                     1.1            1.1
±blockᵇ               3.0                      72.0           1.1
Total frameshifts     82.2                     77.1           11.1
Other duplications    5.0                      1.0            4.4
Other deletions       1.0                      9.9            1.1
TE insertionsᶜ        <1                       1.1            <1

ᵃData are percentages of independent, spontaneous mutants, representing 101 pyrE mutants of S. acidocaldarius (36), 729 lacI mutants of E. coli (38), and 90 URA3 mutants of S. cerevisiae (57, 98).
ᵇIncludes all mutations in which a short (2- to 5-bp) unit is added to, or subtracted from, a naturally occurring repeated sequence, even when this does not change the reading frame.
ᶜTE, transposable genetic element.

Conversely, two types of mutations appear to be
less frequent in S. acidocaldarius than in E. coli and
S. cerevisiae. The first represents insertions by transposable
genetic elements (TEs), which are common in
E. coli (38) and less common in S. cerevisiae (91). Ab- 
sence of TE insertions in the S. acidocaldarius spec- 
trum reflects a dearth of complete insertion sequences 
in this species and apparent inactivity of the few that 
do occur (9). Also underrepresented in the S. acidocal- 
darius spectrum are deletions. As demonstrated by 
analysis of a larger set of mutants, deletions account 
for only about 0.4% of all spontaneous mutations in 
pyrE and do not exhibit any statistical association with 
direct repeats or inverted repeats at their end points 
(37). In contrast, bacterial and eucaryal systems pro- 
duce deletions at much higher rates, and the majority 
of these occur between short direct repeats and in- 
verted repeats (13, 29, 38). Thus, the common modes 
of deletion formation in bacteria and eucarya seem to 
be inefficient in S. acidocaldarius, and this may en- 
hance genome stability. In particular, the genomes of 
Sulfolobus species and other archaea have clusters of 
short, regularly spaced repeats (SRSRs) (78). In S. aci- 
docaldarius, these SRSR clusters consist of 4 to 133 
copies of 24-bp direct repeats separated by short, 
unique sequences (9). This length (24 bp) is enough to 
encourage the deletion of tandem duplications in 
S. acidocaldarius (37). It will thus be of interest to 
measure the genetic stability of SRSR clusters (which 
have been proposed to play roles in chromosome seg- 
regation) in S. acidocaldarius or other archaea. 


Transposable Genetic Elements 


Mutations caused by TEs do not reflect molecu- 
lar properties of genome replication and repair, but 
they occur frequently in many archaea and can have a 
major impact on genome evolution. The most abun- 
dant TEs in archaeal genomes are insertion sequences 
(ISs), which represent the smallest TEs capable of 
independent transposition (7). Most ISs consist of a 
transposase-encoding gene flanked by short inverted 
repeats (IRs), which provide the recognition and 
cleavage sites for the transposase. In the genome, 
these sequences are typically flanked by even shorter 
direct repeats (DRs) that represent duplication of the 
target site upon transposition (62). Due to the high 
informational density of the DNA in most unicellu- 
lar hosts, ISs often inactivate genes when they propa- 
gate, and, unlike antibiotic-resistance transposons, 
they encode no extra genes that can benefit the host 
directly. Thus, ISs seem to represent a classic exam- 
ple of “selfish DNA.” 
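
The IS architecture just described (terminal inverted repeats, flanked in the genome by a duplicated target site) can be expressed as two simple sequence checks. This is only a sketch: it requires exact matches, whereas real inverted repeats are often imperfect, and the function names and example sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeats(element, ir_length):
    """True if the first ir_length bases are the reverse complement of the last."""
    if ir_length <= 0 or 2 * ir_length > len(element):
        raise ValueError("ir_length must be positive and fit twice in the element")
    element = element.upper()
    return element[:ir_length] == reverse_complement(element[-ir_length:])


def has_target_site_duplication(left_flank, right_flank, dr_length):
    """True if the bases adjoining each end of the element are a direct repeat."""
    if dr_length <= 0 or dr_length > min(len(left_flank), len(right_flank)):
        raise ValueError("dr_length must be positive and fit in both flanks")
    return left_flank.upper()[-dr_length:] == right_flank.upper()[:dr_length]


# Invented example: an 8-bp inverted repeat around a stand-in "transposase" region,
# inserted at a site whose 5-bp target sequence has been duplicated.
ir = "GATTACAG"
element = ir + "ATGCCCTTTAAATAG" + reverse_complement(ir)
print(has_terminal_inverted_repeats(element, len(ir)))
print(has_target_site_duplication("ccgtACGTT", "ACGTTggaa", 5))
```

A practical search of genome sequence would need to tolerate mismatches in the repeats and would not know the repeat lengths in advance.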

The mesophilic extreme halophile Halobac- 
terium salinarum (which encompasses strains previ- 
ously designated Halobacterium halobium, Halobac- 
terium cutirubrum, and Halobacterium salinarum) 
contains several families of insertion sequences, each 
present in multiple copies (81). Some of these ISs 
transpose with high frequency among the various 
plasmids and lower G+C-content chromosomal DNA
of this species, which together constitute about 30%
of the total genomic DNA (80, 93, 94). As a result,
some environmentally important genes, notably those
for bacterio-opsin (bop) and gas vacuoles (vac), are
inactivated at very high frequencies
(80). At least two of these IS families, designated 
ISH23 and ISH27, have also been shown to generate 
deletions at high frequencies (82). 

In methanogens, analysis of genomic DNA se- 
quences has identified a diversity of ISs, and only one 
species, Methanothermobacter thermautotrophicus 
(formerly called Methanobacterium thermautotro- 
phicum), has so far been found to lack them com- 
pletely (7). In addition to these native elements, TEs 
from eucarya have been used as genetic tools in 
methanogens (see Chapter 21). 

Multiple, diverse ISs are also the rule in the 
genomes of hyperthermophiles. In particular, the 
genome of Sulfolobus solfataricus contains more ISs 
than any other bacterial or archaeal genome reported 
so far. About 10% of the 2.99 Mb S. solfataricus P2 
genome consists of ISs representing at least 10 families
present in an average of 25 copies each. Most of
these IS copies appear to be intact (7) and to transpose
actively, as indicated by the frequency of spontaneous
rearrangements detected in the course of
genome sequencing (105). Genetic assays confirm
that transposition of several different ISs causes the
vast majority of spontaneous pyrE and pyrF mutations
in the related S. solfataricus strain P1. As
a result, the rate of forming pyrE and pyrF mutants 
is 10 to 100 times higher in S. solfataricus than in 
S. acidocaldarius (65). Additional ISs have been found 
by examination of other Sulfolobus isolates. For ex- 
ample, six of seven ISs found to transpose in Sulfo- 
lobus “islandicus” were distinct from the ISs detected 
previously in S. solfataricus strains by sequencing or 
genetic assay (5). 

The observed phylogenetic diversity of ISs in Sul- 
folobus is accompanied by diversity of functional 
properties. Transposition frequencies of seven ISs in 
various S. “islandicus” isolates varied over a 250-fold 
range. The ISs also differed in target-site specificity, 
although the region between the TATA box and the 
transcription start site of pyrE seemed to represent a 
preferred region for transposition in both S. solfatar- 
icus and S. “islandicus” (5, 65). A more striking dif- 
ference among the ISs in S. “islandicus” related to 
precise excision. This is a significant form of TE excision,
because it represents the primary route to
restoring gene function when a TE has transposed
into a gene, and it has been shown to be mediated by
DNA-metabolizing systems of the host, rather than of
the TE (61, 85). Only one of the seven S. “islandicus”
ISs evaluated was lost precisely from pyrE, thereby
restoring growth without uracil; ironically, this was
also the only IS having no IRs or DRs to facilitate 
deletion (5). Precise removal of this IS (provisionally 
called ISC1926) is consistent with the fact that it be- 
longs to a family of TEs which transpose via precise 
excision of a circular intermediate that subsequently 
integrates at a new site (62). Conversely, the fact that 
none of the other ISs were deleted precisely at any detectable
frequency (i.e., about 5 × 10⁻¹⁰) remains
consistent with the observation that short, hyphenated
(i.e., nontandem) IRs and DRs do not promote
deletion in another Sulfolobus species (37). 

The observed failure of typical Sulfolobus ISs to 
excise precisely even under strong selection implies 
that when they insert into a host gene, the loss of gene 
function is normally irreversible. This has important 
implications for genome evolution (see Chapter 19), 
because in the absence of other mechanisms (such as 
transfer of an intact copy of the gene from another 
cell), this property effectively shields IS insertions 
from selection for removal. This does not mean that 
IS insertions cannot be lost, but it does predict that 
the loss will normally be incremental and slow and 
will not restore gene function. BLAST analyses show 
that certain ISs are represented only by fragmentary 
relics in the sequenced Sulfolobus genomes (5), and 
that other ISs present as multiple, full-length copies 
are also accompanied by numbers of nonfunctional 
copies bearing distinct mutations (44, 105). This pat- 
tern seems consistent with slow removal of ISs by in- 
cremental mutation causing accumulation of inactive 
copies, although others have postulated a mechanism 
that specifically inactivates ISs when they become too 
numerous in a genome (7). Another pattern seen in 
S. solfataricus is the congregation of ISs in certain re- 
gions of the chromosome and insertion into each 
other (7). This may be explained by the abundance 
and activity of ISs in this species, which results in the 
largest and least deleterious target sites for IS trans- 
position being provided by other ISs. 

In addition to full-length ISs and inactive IS frag- 
ments, archaea harbor short elements that resemble 
miniature inverted-repeat transposable elements, or 
MITEs (84). This type of TE was first discovered as 
abundant, 50- to 200-bp repeated sequences in 
eucaryal chromosomes (75). The mobility of MITEs 
is thought to reflect complementation by ISs, since 
MITEs possess IRs matching those of full-length, 
functional ISs present elsewhere in the genome (75). 
Two classes of MITEs are distinguished by the nature 
of the interior sequence: Type I MITEs retain se- 
quences of the corresponding helper IS, as though 
they were derived from a related IS by partial deletion 
of the transposase gene. Type II MITEs have sequences
between the IRs that are unrelated to those of
the helper IS, and a basis for their formation remains
unclear. Like the corresponding ISs, at least some of 
the MITEs found in Sulfolobus species can transpose 
and inactivate genes, as demonstrated experimentally 
in S. “islandicus” (5). 


DNA REPAIR 


All cells have multiple, distinct systems of special- 
ized enzymes that cooperate to keep the genome intact 
and ready for replication. Figure 2 summarizes these 
systems according to the biochemical strategy they 
employ to deal with DNA damage (for comprehensive 
review, see reference 26). Certain forms of damage 
(alkylation and UV-induced dimerization) are simply 
reversed, whereas oxidized or fragmented bases, vari- 
ous helix-distorting adducts, and mismatched bases 
must be excised and replaced. Should the DNA dam- 
age escape repair by these mechanisms, two additional 
strategies allow replication to proceed anyway: trans- 
lesion synthesis (TLS) by specialized DNA poly- 
merases, which are often inaccurate and thus muta- 
genic, and homologous recombination (HR). These 
last two strategies do not actually remove the DNA le- 
sion and are thus more accurately termed “tolerance 
pathways.” The components of these seven pathways 
have been defined by experimental analysis of bacte- 
ria and eucarya (26), and homologs of the correspond- 
ing genes can be found in all three domains (Table 2). 

According to computational analysis of complete 
genome sequences, putative genes for alkyltransferase 
(AT), base-excision repair (BER), and HR occur in es- 
sentially all archaeal genera; representative examples 
are summarized in Table 2. In many cases, the corre- 
sponding biochemical function has also been detected. 
For example, alkyltransferase and DNA N-glycosy- 
lase activities have been detected in cell-free extracts 
of thermophilic archaea (50, 111) and in proteins ex- 
pressed from archaeal genes in E. coli (10, 56, 60, 95). 
HR has been detected by genetic assays in several ar- 
chaea (see “Genetic recombination,” below), and has 
been shown to require radA, the archaeal homolog 
of eucaryal RAD51 and bacterial recA genes, in
Haloferax volcanii (124). The RadA proteins of other 
archaea catalyze nucleoprotein filament formation 
and strand-exchange reactions in vitro (69, 102), and 
the radA gene of H. volcanii was also shown to con- 
tribute to survival of DNA damage (124). Highly ef- 
ficient HR in Pyrococcus and related archaea is indi- 
cated by survival of extremely high doses of ionizing 
radiation and reconstruction of a complete chromo- 
some from the resulting small DNA fragments (14). 

Putative photoreactivation (PR), TLS, MMR, 
and NER genes present a more complex distribution
in which the number of DNA repair systems in archaea
seems to decline with increasing growth temperature.
For example, genes encoding DNA photolyases
are evident in only about half of the archaeal
genera included in Table 2. The validity of this indi- 
cation from genome sequences is reinforced by the 
fact that PR has been demonstrated by experiment in 
each of the three archaea encoding putative pho- 
tolyases (20, 45, 67, 68, 123). The DNA photolyase 
of M. thermautotrophicus was also purified to homo- 
geneity and characterized in vitro (46). However, lack 
of photoreactivation has not been confirmed in the 
archaea that lack putative photolyase genes. Genes 
encoding error-prone family-Y DNA polymerases 
show a similar pattern, being evident primarily in 
mesophilic archaea and Sulfolobus spp. (Table 2). 
Mutagenesis of S. acidocaldarius by a wide range of 
DNA-damaging agents provides indirect evidence in 
vivo of error-prone TLS (86, 123), and assays in vitro 
have confirmed the bypass properties of family-Y 
DNA polymerase from Sulfolobus species (6, 48). 
X-ray crystallographic structures of two such en- 
zymes have been resolved and suggest a structural ba- 
sis for frameshifting during TLS (59, 108). However, 
the archaea that lack these family-Y polymerases 
have apparently not been examined for a lack of dam- 
age-induced mutagenesis. 

Proteins responsible for MMR in bacteria and 
eucarya show similarity to the corresponding E. coli 
proteins MutS and MutL, whereas NER systems are 
of two distinct types. Bacterial NER requires only 
three proteins (UvrABC), whereas eucaryal NER re- 
quires at least seven (83). The bacterial and eucaryal 
NER proteins have low similarity to each other, but 
within their respective domains, they are well con- 
served (18). Among archaea, homologs of bacterial 
and eucaryal NER genes co-occur, and the three re- 
pair systems (MMR, bacterial NER, and eucaryal 
NER) seem to be distributed according to the optimal 
growth temperature of the species (Table 2). Thus, 
most archaea growing optimally below about 55°C 
encode genes representing a complete MMR system, 
a complete bacterial NER system, and a partial eucaryal
NER system missing RAD14/XPA, RAD4/XPC,
RAD23/hR23b, and RAD10/ERCC1 homologs.
With one exception so far (M. thermautotrophicus),
archaea growing optimally above about
55°C lack MMR genes, bacterial NER genes (uvr- 
ABC homologs), and at least one of the eucaryal 
NER-gene homologs found in mesophilic archaea, 
without gaining homologs of any additional eucaryal 
NER genes. 
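
The temperature trend described above can be tallied directly from the archaeal rows of Table 2. The following Python sketch is only an illustration: the dictionary is a hand transcription of the table, cells that print as "T" or "F" in the scanned copy are assumed to be "+", marginal "?" matches are not counted as present, and the function names are hypothetical.

```python
# Pathway calls transcribed from Table 2 (archaeal rows only).
# Assumption: cells printed as "T" or "F" in the scanned table are read as "+".
PATHWAYS = ("AT", "PR", "BER", "NER", "MMR", "TLS", "HR")

TABLE2_ARCHAEA = {
    # genus: (approximate optimal growth temperature in deg C, call per pathway)
    "Methanosarcina":      (37, ("+", "+", "+", "Uvr", "+", "+", "RadA")),
    "Halobacterium":       (50, ("+", "+", "+", "Uvr", "+", "+", "RadA")),
    "Thermoplasma":        (59, ("?", "No", "+", "No", "No", "No", "RadA")),
    "Methanothermobacter": (65, ("+", "+", "+", "Uvr", "No", "No", "RadA")),
    "Sulfolobus":          (80, ("+", "+", "+", "No", "No", "+", "RadA")),
    "Archaeoglobus":       (83, ("+", "No", "+", "No", "No", "No", "RadA")),
    "Pyrobaculum":         (98, ("+", "No", "+", "No", "No", "No", "RadA")),
    "Pyrococcus":          (98, ("+", "No", "+", "No", "No", "No", "RadA")),
}


def pathways_present(calls):
    """Count pathways scored as present; "No" and marginal "?" calls are excluded."""
    return sum(1 for call in calls if call not in ("No", "?"))


def mean_count(rows):
    """Mean number of pathways present over a list of (temperature, calls) rows."""
    counts = [pathways_present(calls) for _, calls in rows]
    return sum(counts) / len(counts)


def split_by_temperature(table, threshold_c=55):
    """Return (mean count below the threshold, mean count at or above it)."""
    below = [row for row in table.values() if row[0] < threshold_c]
    above = [row for row in table.values() if row[0] >= threshold_c]
    return mean_count(below), mean_count(above)


if __name__ == "__main__":
    for genus, (t_opt, calls) in TABLE2_ARCHAEA.items():
        print(f"{genus:<20} {t_opt:>3} C  {pathways_present(calls)} of {len(PATHWAYS)}")
    print(split_by_temperature(TABLE2_ARCHAEA))
```

The 55°C threshold is the one used in the text. A tally of this kind counts only the presence of recognizable homologs and says nothing about whether a pathway is active in vivo.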

Figure 2. Molecular strategies for coping with DNA damage. Schematic summary of the molecular events associated with damage
reversal, damage excision, and damage tolerance. Abbreviations: AT, alkyl transfer; PR, photoreactivation; BER, base excision
repair; NER, nucleotide excision repair; MMR, mismatch repair; TLS, trans-lesion synthesis; HR, homologous recombination.
Some of the processes (HR, in particular) are shown greatly simplified; for comprehensive review, see Friedberg et al. (26).

Table 2. Representation of major DNA repair pathways in microbial genomesᵃ

Genus                  Toptᵇ   AT    PR    BER   NERᶜ   MMR   TLS   HRᵈ

Archaea
Methanosarcina          37     +     +     +     Uvr    +     +     RadA
Halobacterium           50     +     +     +     Uvr    +     +     RadA
Thermoplasma            59     ?     No    +     No     No    No    RadA
Methanothermobacter     65     +     +     +     Uvr    No    No    RadA
Sulfolobusᵉ             80     +     +     +     No     No    +     RadA
Archaeoglobus           83     +     No    +     No     No    No    RadA
Pyrobaculumᵉ            98     +     No    +     No     No    No    RadA
Pyrococcus              98     +     No    +     No     No    No    RadA

Bacteria
Escherichia             37     +     +     +     Uvr    +     +     RecA
Thermotoga              80     +     No    +     Uvr    +     No    RecA

Eucarya
Saccharomyces           30     +     +     +     Rad    +     +     Rad51

ᵃCompiled from The Institute for Genomic Research Comprehensive Microbial Resource (TIGR CMR, version 16.0;
http://pathema.tigr.org/tigr-scripts/CMR/CmrHomePage.cgi). In the table, “+” indicates that sufficient relevant COGs (gene
families) are represented in the genome to form a functional pathway; “?” indicates a match of marginal significance
(P > 10⁻⁷), and “No” indicates that the necessary homologs are not present. For pathway abbreviations, see the legend to Fig. 2.
ᵇTemperature (°C) yielding fastest growth (approximate).
ᶜ“Uvr” indicates a complete pathway of the bacterial type; “Rad” indicates a complete pathway of the eucaryal type.
ᵈFamily of recombinase found; see Sandler et al. (92) for molecular differences among the families.
ᵉMembers of the Crenarchaeota; all other archaea listed are members of the Euryarchaeota (122).

A few of the thermophilic archaea, primarily Sulfolobus
species, have been examined for their ability
to repair DNA damage in vivo, with no obvious defects
detected (see “An alternative form of NER?”
below). The simplest reconciliation of these two ob- 
servations (i.e., absence of genes required for MMR 
and NER on one hand, and absence of obvious 
DNA repair defects on the other), would seem to be 
that the thermophilic and hyperthermophilic archaea 
have alternative molecular strategies that assume the 
function of classical MMR and NER systems but do 
not involve homologous proteins. Meanwhile, spe- 
cific experimental evidence supporting or contra- 
dicting the idea of alternative MMR or NER strate- 
gies in hyperthermophilic archaea remains limited 
and remarkably balanced, as summarized in the next
two sections.


An Alternative Form of MMR? 


The primary evidence for postreplicational 
MMR in hyperthermophilic archaea comes from
the low rate of spontaneous mutation measured in 
S. acidocaldarius, which is about 0.1% of mutation 
rates of MMR-deficient bacteria (42, 64). There is 
so far no evidence that other Sulfolobus species 
replicate their DNA less accurately than S. acidocaldarius
does, since the higher mutation rates reported
for them can be attributed to IS transposition.
However, a distantly related crenarchaeon, Pyrobaculum
aerophilum, has been claimed to be a
natural mutator, based on the length heterogeneity 
of mononucleotide tracts in its genomic DNA (24). 
Nearly all (16 of 18) of the longest mononucleotide 
tracts of this organism were recovered in multiple 
lengths by cloning, and most of the variable sites 
were located between, not within, predicted genes. 
The available data neither exclude nor prove funda- 
mental genetic differences between S. acidocaldar- 
ius and P. aerophilum. Differences are suggested, 
for example, by failure to observe length hetero- 
geneity in the mononucleotide tracts of S. acidocal- 
darius during genome sequencing (L. Chen, per- 
sonal communication), despite the fact that several 
of these are as long as the variable tracts in P. aero- 
philum. On the other hand, the Pyrobaculum DNA 
used for cloning came from a large population of 
cells (23), suggesting the possibility of accumulated 
spontaneous mutations. In addition, the sponta- 
neous mutation spectrum of S. acidocaldarius is 
dominated by the types of slipped-strand events 
that characterize MMR-deficient mutants of bacte- 
ria and eucarya (32, 36). Note that S. acidocaldar- 
ius could conceivably achieve a low overall muta- 
tion rate in the absence of MMR, provided (i) the 
genome has relatively few long mononucleotide 
tracts and (ii) extremely accurate proofreading occurs
within the replicative polymerase complex. As
described above (“Spontaneous mutation”), preliminary
examination of the S. acidocaldarius genome
sequence supports (i). With respect to (ii), the observed
error rate of replicating pyrE, 7.8 × 10⁻¹⁰
per bp, is only about 15-fold lower than that of bacteriophage
T4 replication, which does not involve
postreplicational MMR (16). Thus, a replicative
complex with 15-fold more efficient proofreading
than the T4 replicase could, in theory, explain the
mutational properties of S. acidocaldarius (J. Drake,
personal communication).
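
Condition (i) above, and the comparison of the longest tracts in the two genomes, both rest on locating runs of a single nucleotide in a sequence. A minimal Python sketch of such a scan is given here for illustration. The function names and the default thresholds are assumptions chosen for the example (the default of 18 tracts merely echoes the number examined in P. aerophilum), and the code is not the procedure used in the cited studies.

```python
import re


def mononucleotide_tracts(sequence, min_length=8):
    """Return (start, base, length) for every run of one base at least min_length long."""
    # One captured base followed by at least (min_length - 1) repeats of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(sequence.upper())]


def longest_tracts(sequence, top_n=18, min_length=2):
    """Return the top_n longest tracts, longest first, ties broken by position."""
    tracts = mononucleotide_tracts(sequence, min_length)
    return sorted(tracts, key=lambda tract: (-tract[2], tract[0]))[:top_n]
```

For example, calling `mononucleotide_tracts` on a sequence built as `"ACG" + "T" * 9 + "AC" + "G" * 8 + "A"` reports a nine-base T tract starting at position 3 and an eight-base G tract starting at position 14 (positions counted from zero). Comparing tract positions with gene annotations, as in the P. aerophilum analysis, would require coordinates that are not modeled here.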


An Alternative Form of NER? 


The broad conservation of NER systems among 
bacteria and eucarya argues that this form of repair, 
which typically operates on large, helix-distorting 
DNA lesions, is fundamental to evolutionary fitness 
(18). In addition, the conservation of the latter por- 
tion of the eucaryal pathway in all archaea suggests 
that these “downstream” enzymes, primarily heli- 
cases and structure-specific endonucleases, have a 
function that is being maintained by selection. In par- 
ticular, the only NER protein homologs found in all 
eucarya and consistently absent from hyperther- 
mophilic archaea are those that bind to the DNA le- 
sion and thus initiate the process of excising it as an 
oligonucleotide. Accordingly, the only “novel” DNA 
repair proteins that may need to be hypothesized for 
many archaea represent alternatives to the eucaryal 
damage-recognition proteins (Table 2). However, this 
argument based on gene conservation leads to a very 
different conclusion when applied to the mesophilic 
and moderately thermophilic archaea. These archaea 
encode complete, apparently functional NER systems 
of the bacterial type (Table 2) (67, 68, 74). Thus if the 
eucaryal-NER homologs indeed have a cellular func- 
tion in these species, it would seem to be distinct from 
that of the bacterial NER homologs, which these ar- 
chaea also encode. 

Experimental evidence for NER in hyperther- 
mophilic archaea remains limited. Survival curves in- 
dicate that S. acidocaldarius has about half of the 
dark-repair capacity for UV photoproducts of Uvr⁺
E. coli (123), which would be a significant level of 
NER. In addition, UV photoproducts are converted 
in the dark to apparently recombinogenic lesions that 
are removed at physiological temperature but not at 
room temperature (101). However, the S. acidocal- 
darius genome has recently been shown to encode a 
UVDE homolog (9), and the predicted function of 
this protein is to initiate removal of UV photoproducts
by an alternative pathway (127). Thus, this gene
product could explain the observed UV responses of
S. acidocaldarius. Biochemical activities of the eu- 
caryal-NER homologs encoded by S. solfataricus 
have been confirmed to be consistent with “downstream”
(i.e., late) events of NER (89), and T4 endonuclease
V-sensitive sites disappear from the DNA
of UV-irradiated S. solfataricus cells incubated in the 
dark at physiological temperature (73). Since the 
genomes of S. solfataricus and most other archaea en- 
code no UVDE homolog (9), the latter result provides 
perhaps the strongest experimental evidence to date 
for some alternative form of NER in hyperthermophilic 
archaea. It remains unclear, however, whether this 
system is functionally equivalent to conventional 
NER. Like UVDE, for example, it may use an alter- 
native mechanism, or it may lack the broad substrate 
range of conventional NER. 


Uracil-Sensitive DNA Synthesis 


As a rule, archaea encode multiple DNA poly- 
merases of the B family, at least one of which is pre- 
sumed to replicate the genome. Because of technolog- 
ical interest, most attention has been focused on the 
DNA polymerases of hyperthermophilic archaea, and 
these were first shown by Lasken et al. (55) to have 
the unusual property of stalling at uracil residues in 
the template strand. This is now known to result from 
a uracil-specific binding pocket in the polymerase that 
scans the template strand 4 to 6 nt ahead of the cat- 
alytic site (31). The biological function commonly 
proposed for this feature is avoidance of transition 
mutations initiated by dC deamination. However, as 
the molecular properties of archaeal DNA replication 
and BER become clear (see Chapter 3), a more com- 
plex rationale for this phenomenon seems increas- 
ingly necessary. 
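
The geometry of this “read-ahead” recognition can be made concrete with a toy model. The Python sketch below is an illustration under stated assumptions, not a description of the enzyme: the template is written in the order in which the polymerase encounters it, the pocket is assumed to sit a fixed number of nucleotides ahead of the catalytic site (4 by default, within the 4 to 6 nt range given above), and the function name is hypothetical.

```python
def stall_positions(template, read_ahead=4):
    """Return catalytic-site indices at which a read-ahead pocket meets a uracil.

    The template is indexed in the direction of polymerase travel. Uracils
    closer to the start than the read-ahead distance are ignored, because the
    catalytic site would have to sit before the first template position.
    """
    template = template.upper()
    return [index - read_ahead
            for index, base in enumerate(template)
            if base == "U" and index - read_ahead >= 0]
```

With the template `"ACGTACGUACGT"`, the uracil at position 7 gives a predicted stall with the catalytic site at position 3 for a 4-nt read-ahead, and at position 1 for a 6-nt read-ahead. The model ignores the length requirement for the single-stranded template and every kinetic aspect of stalling.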

For example, systematic measurements of bind- 
ing affinity reveal the striking molecular specificity of 
the phenomenon. The special pocket found in ar- 
chaeal polymerases binds only uracil, not structurally 
similar derivatives, and only when incorporated into 
ssDNAs longer than about 15 nt (107). Thus, stalling 
is neither induced by other potentially mutagenic 
pyrimidine bases, nor by uracil in a context distinct 
from a replication fork. The fact that the complex 
stalls with the uracil residue in ssDNA protected by 
the enzyme (25) would seem to pose serious obsta- 
cles to BER. Repair strategies that involve dissocia- 
tion of the polymerase must initially protect the uracil 
from BER, because it is within ssDNA. Regression of 
the replication fork would allow the uracil-containing 
strand to reanneal to its original partner so that BER 
could repair the lesion, although this may be compli- 
cated by the proximity of the template uracil residue 
to the nascent 3’ end. Alternatively, displacement of
the replicative polymerase by a specialized TLS poly- 
merase would, in principle, allow for accurate repli- 
cation past the uracil, in a manner analogous to UV 
photoproducts (6, 117). Another possibility is that a 
uracil DNA glycosylase breaks the replication fork by 
excising the uracil from the ssDNA, thereby trigger- 
ing double-strand-break repair (via HR) and re- 
assembly of the replication fork (51). In any case, as- 
says in vitro indicate that, without some form of 
rescue, the stalled complex should eventually insert 
an A opposite the U and continue (25). 

These various mechanistic possibilities neverthe- 
less fail to explain why uracil-induced stalling should 
occur only in family-B DNA polymerases of archaea. 
The need for BER at high temperatures remains ob- 
vious, but there is no evidence that hyperthermophilic 
archaea do not excel at this form of DNA repair. 
These archaea typically encode multiple, diverse 
DNA N-glycosylases which exhibit high activity in 
cell extracts and seem to be part of a sophisticated, 
versatile BER system (50, 60, 95). In particular, no 
reason has emerged explaining why hyperther- 
mophilic bacteria, which tend to have higher G+C 
content in their DNA than their archaeal counter- 
parts, should not also need this form of “read-ahead” 
proofreading. The observation that archaeal family- 
B DNA polymerases fail to discriminate effectively 
against deaminated dNTPs (40, 55) raises the possi- 
bility that the primary source of dU in archaeal DNA 
is incorporation by polymerase, rather than sponta- 
neous deamination of dC residues. Thus, the more se- 
rious fate avoided by polymerase stalling may be the 
double-strand breaks from BER of dU residues placed 
close to each other on opposite strands (35), or some 
other problem yet to be identified. 


GENETIC RECOMBINATION 


In its most general sense, genetic recombination 
means the creation of new DNA sequences from ex- 
isting sequences by processes involving strand ex- 
change rather than error-prone synthesis. This defin- 
ition accommodates recent progress in bacterial and 
yeast systems that has blurred historical distinctions 
among replication, repair, and recombination (51, 
112), and encompasses a diversity of molecular events 
that all have important effects on the stability and 
evolution of microbial genomes. Under this broad de- 
finition, four types of recombination can be distin- 
guished, according to the role played by the DNA se- 
quences involved (Table 3). General, or homologous, 
recombination (HR) requires significant intervals 
(250 nt) of identical sequence to effect strand ex- 
change, but places few other constraints on the se- 
quence. This reflects its mechanism, in which partner 
recognition is mediated primarily by the DNA se- 
quences. In contrast, site-specific recombination re- 
quires a specialized enzyme to bind a specific, short 
DNA sequence in both partners and catalyze strand 
exchange between these short sequences. Transposi- 
tion resembles site-specific recombination in requiring 
protein recognition of one DNA partner in the exchange
(i.e., the ends of the transposable element), but
the other partner (the target sequence) is typically 
chosen with little or no sequence specificity. Finally, 
“illegitimate” recombination breaks and joins DNA 
sequences with negligible influence of the sequences 
involved. Some situations make it difficult to distin- 
guish illegitimate and homologous recombination, as 
recombinase-independent break-and-join events may 
be facilitated by very short matching sequences (mi- 
crohomologies) (12). 


Table 3. Distinct types of genetic recombination

Type, subtype     Defining properties                     Genomic consequences             Archaeal examplesᵃ

Homologous        Requires extensive sequence identity
  Reciprocal      All markers preserved in the products   Crossovers                       Targeted integration of plasmid constructs
  Nonreciprocal   Marker(s) eliminated                    Gene conversion                  Intragenic recombination in
                                                                                           S. acidocaldarius (hypothesized)
Site-specific     Integrase/excisionase operates on       Virus/plasmid integration        SSV1, conjugal plasmids of Sulfolobus
                  a specific sequence within both         and excision
                  partner DNAs
Transpositionᵇ    Transposase operates on a specific      Relocation of TE                 Insertion sequences, MITEs
                  DNA sequence at the TE boundaries
Illegitimate      DNAs broken and rejoined essentially    Large rearrangements,            None identified
                  at random                               error-prone repair of
                                                          double-strand breaks

ᵃSee text for evidence and relevant literature.
ᵇAlso classified as “nonconservative site-specific” recombination.

Although HR has genetic consequences, in particular,
in conjunction with DNA transfer (discussed
below in “DNA transfer”), it appears to serve pri- 
marily as part of a system that reassembles replication 
forks that have stalled or fallen apart (51, 112). How- 
ever, even a strict conservation of this role across all 
three domains leaves room for significant differences 
in the genetic properties of HR. In addition, archaeal 
recombinases (RadA proteins) differ structurally from 
bacterial RecA proteins, and less so from eucaryal 
Rad51 proteins (92). Furthermore, some archaea en- 
code RadA paralogs, represented by the RadB pro- 
teins of Haloferax, Thermococcus, and Pyrococcus 
species (1, 49). The Pyrococcus RadB protein has 
been reported to interact with a DNA polymerase 
subunit, a putative Holliday junction resolvase, and 
RadA, depending on parameters such as ATP concen- 
tration (49). H. volcanii RadB is not essential for HR 
(1) and thus appears to play a yet-undefined role in 
DNA metabolism. 

Consistent with the apparent ubiquity of RadA 
orthologs in archaea, HR has been documented in di- 
verse archaeal species. For example, the generation of 
selectable phenotypes from genetically marked strains 
(auxotrophs and resistance mutants) has been demon- 
strated with H. volcanii, M. voltae, M. thermauto- 
trophicus, and S. acidocaldarius (4, 34, 71, 125). 
Since the corresponding strains lack episomes, these 
results provide strong presumptive evidence for 
marker incorporation by homologous recombination. 
Other studies have incorporated artificial DNA con- 
structs into archaeal genomes via homologous re- 
combination (see Chapter 21). 

Functional properties capable of distinguishing 
archaeal HR from HR in classic bacterial and eu- 
caryal systems include (i) the way in which substrate 
(homology) length affects recombination efficiency, 
(ii) the relative frequency of reciprocal versus non- 
reciprocal events, and (iii) the effect of occasional 
mismatches on recombination frequency. With re- 
spect to (i), the efficiency of reciprocal recombination 
(i.e., the yield of crossovers) relates in a characteris- 
tic way to the distance between the markers. Below a 
certain threshold, called the “minimal efficient pro- 
cessing segment,” or MEPS, recombination is ex- 
tremely rare and often independent of RecA/Rad51 
function (106). At larger distances, the frequency of 
recombinants increases in proportion to the effective 
cross section of the detector (i.e., the interval between 
the two markers). At still greater distances, multiple 
crossovers between markers become increasingly fre- 
quent, diminishing the increase in recombinants and 
leading eventually to a plateau. This frequency/distance
pattern of reciprocal HR occurs in diverse systems,
although it exhibits quantitative differences
among them. Thus, MEPSs observed in bacteriophage,
bacteria, and eucaryal cells are about 40, 70, and 250 nt, 
respectively, whereas the midpoint of the proportional- 
increase region corresponds to roughly 2,000, 20,000, 
and 30,000 nt, respectively (97, 106, 110). 
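
The three regimes just described (a threshold at the MEPS, a roughly proportional rise, and a plateau) can be captured by a simple saturating curve. The Python sketch below is a toy model under stated assumptions: the exponential form is borrowed loosely from the shape of classical mapping functions and is not fitted to any data, the reported midpoints are reused only as rough length scales, frequencies are expressed relative to the plateau, and all names are hypothetical.

```python
import math

# MEPS and length scale (nt) per system, using the approximate values in the text.
# Assumption: the reported midpoints are reused here simply as length scales.
SYSTEMS = {
    "bacteriophage": (40, 2_000),
    "bacteria": (70, 20_000),
    "eucaryal cells": (250, 30_000),
}


def recombinant_frequency(distance_nt, meps_nt, scale_nt, plateau=1.0):
    """Relative yield of recombinants for two markers separated by distance_nt.

    Below the MEPS the model returns zero. Above it, the yield rises almost
    linearly at first and then saturates at the plateau as multiple crossovers
    between the markers become common.
    """
    if distance_nt < meps_nt:
        return 0.0
    return plateau * (1.0 - math.exp(-(distance_nt - meps_nt) / scale_nt))


if __name__ == "__main__":
    for name, (meps_nt, scale_nt) in SYSTEMS.items():
        yields = [recombinant_frequency(d, meps_nt, scale_nt)
                  for d in (50, 500, 5_000, 50_000)]
        print(name, [round(y, 3) for y in yields])
```

A curve of this shape has two adjustable features, the threshold and the distance over which the rise saturates, which is enough to express the quantitative differences among systems noted above.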

In contrast, HR between pyrE mutations in the 
S. acidocaldarius chromosome, initiated by conjuga- 
tion, does not fit this consensus behavior. Frequency- 
versus-distance data have provided no statistical sup- 
port for an MEPS, and the region of proportionality 
extends to only about 50 nt, beyond which recombi- 
nation was found to be fairly constant as a function 
of marker separation (39). Although an effect of the 
(yet-undefined) DNA transfer mechanism cannot be 
excluded, the same frequency-versus-distance behav- 
ior was observed with over 50 combinations of mu- 
tations located at various points in pyrE and was not 
altered by eightfold stimulation of recombination by 
prior UV irradiation (39); it thus seems to be a ro- 
bust property of the mode of HR that accompanies 
conjugation in this species. The most similar fre- 
quency-versus-distance behavior identified so far in 
other genetic systems is that of a strand-annealing 
pathway of double-strand break repair in S. cere- 
visiae (114). Based on this and other information, the 
S. acidocaldarius results seem easiest to reconcile with 
nonreciprocal recombination in which base-pairing 
allows relatively short patches of single-stranded 
donor DNA to simply replace corresponding ssDNA 
segments of the recipient (39). A similar mechanism 
may also explain the relative efficiency with which 
linear DNAs transfer markers to the S. acidocaldarius 
chromosome and the fact that this occurs even when 
the homologous sequence is very short (53). 

Reciprocal recombination has been demonstrated 
in other archaea by the integration of nonreplicating 
circular DNAs into the chromosome, usually in the 
course of allele replacement (see Chapter 21). In 
Methanococcus voltae, the heterozygotic intermediate 
formed by the first crossover was readily detected in 
transformant clones and found to be stable in the ab- 
sence of selection for at least 40 generations following 
initial selection (27). Efforts to detect such intermedi- 
ates were not reported in similar manipulations of two 
hyperthermophilic archaea (96, 126), and were unsuc- 
cessful in S. acidocaldarius, despite the use of PCR af- 
ter only six generations of growth (53). Thus, it re- 
mains unclear whether outcomes of recombination 
between circular DNAs and host genomes differ 
among the major groups of archaea. 

Examples of site-specific recombination in ar- 
chaea include viruses that integrate into the host 
chromosome at specific sites. The best-studied exam- 
ple, SSV1, integrates into the Sulfolobus shibatae 
genome by site-specific recombination at a tRNAᴬʳᵍ
gene. This interrupts the integrase gene of the virus,
which presumably stabilizes the integrated provirus 
(87). Purified SSV1 integrase catalyzes both forward 
and reverse recombination reactions in vitro (72). Un- 
der these conditions, the site-specific recombination 
requires a minimal sequence of only 19 bp, corre- 
sponding to the distal portion of the anticodon stem 
and loop of the tRNA (103). Similar features have 
been found in integration of certain bacteriophage 
(87), and more recently, in conjugative plasmids in- 
tegrated into Sulfolobus genomes. Unlike the integra- 
tion of SSV1, these latter events do not disrupt the 
integrase gene itself (104).

Several examples of IS transposition have also 
been documented in archaea. Most of these ISs have
relatives in bacteria and eucarya, and the correspond- 
ing IS families represent diverse molecular strategies 
of transposition (62). According to the data available, 
ISs in halophilic archaea seem unusual with regard 
to the high frequency with which they transpose and 
delete adjacent sequences (81, 82, 93). On the other 
hand, the archaeal ISs that have been characterized 
appear to resemble their bacterial counterparts in 
transposing preferentially into the 5'-untranslated 
(i.e., promoter) regions of target genes (5, 65, 81). 

Illegitimate recombination (as suggested by its
name) is rare and difficult to detect specifically by genetic
assays; as a result, it has not been analyzed systematically
in any archaeon. In eucarya and bacteria,
illegitimate recombination often results from the ac- 
tivity of type II DNA topoisomerases and from non- 
homologous end-joining of double-strand breaks (17, 
41). In S. acidocaldarius, analysis of sequences at the 
end points of large spontaneous deletions could not 
support or exclude either possibility (37). 


DNA Transfer 


The mechanisms that enable cells to receive 
DNA from the environment, defective virions, or
other cells are in most respects independent of ho- 
mologous recombination and arguably less important 
for maintaining structural integrity of the genome. 
However, in combination with HR or other mecha- 
nisms that allow the sequence to persist in the recipi- 
ent, DNA transfer provides microorganisms with the 
opportunity to acquire ecologically useful alleles and 
genes. According to theoretical models, this source 
of genetic variation greatly accelerates the evolution- 
ary adaptation of those lineages, relative to lineages 
that lack DNA transfer (8). The three modes of DNA
transfer known in bacteria (transformation, transduction,
and conjugation) also operate in archaea. However,
as in bacteria, the events are rare and detecting
them experimentally requires genetic selections,
which are not available for many archaea. Natural
transformation (i.e., uptake of naked DNA under 
conditions that would be considered physiologically 
normal) has so far been reported only for M. voltae 
and occurred at low frequency (4). Documented ex- 
amples of transduction and conjugation are not dra- 
matically more numerous, but they occur in diverse 
archaeal clades and exhibit certain interesting differ- 
ences from the classical systems. 


Transduction 


Two rather different examples of DNA transfer 
by virion-like particles have been observed in metha- 
nogens. In the first example, a lytic virus (phage), 
ΨM1, of M. thermautotrophicus has been shown to
transduce chromosomal genetic markers of its host
at frequencies of 10⁻⁶ to 10⁻⁴ recombinants per particle
(70). In the second example, cells of M. voltae
shed small icosahedral particles that resemble bacte- 
riophage and contain small fragments of host DNA 
(19). Incubation of these particles with suitably 
marked recipients generates recombinants at higher 
frequencies than achieved by transformation of this 
species. This “voltae transfer agent” (VTA) loses its 
activity relatively rapidly, which has complicated its 
analysis (3). Most features, including the absence of 
any viral DNA or cell killing associated with the par- 
ticles, argue that VTA is not a classical transducing 
phage (virus) (3), although similar transfer phenomena 
have been described in a few bacteria, most notably 
Rhodobacter capsulatus (54). 


Conjugation 


Haloferax volcanii was the first archaeon to 
demonstrate cell-to-cell transfer of DNA, by generat- 
ing prototrophic recombinants from mixtures of two 
stable, genetically distinct auxotrophs. The transfer of 
chromosomal DNA required some stabilization of 
cell-cell contacts (71), which is also seen with some 
bacterial and eucaryal conjugation systems (121). Un- 
like bacterial systems, however, it did not require ge- 
netically distinct donor and recipient strains (71). A 
point of origin and direction of transfer have not been 
tested in this or other archaeal conjugation systems, 
because of the difficulty of producing multiply marked 
strains and other experimental limitations. H. vol- 
canii can also transmit certain plasmids to Haloferax 
mediterranei, but transfer of the same plasmids in the 
reverse direction, or transfer from H. volcanii to 
other species, could not be detected (116). It thus remains
unclear whether successful transfer requires
certain functions missing in particular host/plasmid
combinations or is constrained by other aspects of the
donor and recipient.

Conjugational transfer of a plasmid between Sulfolobus
species was first observed for a large plasmid,
pNOB8, by coculture of its native host with an excess 
of recipient cells having a distinct genomic restriction 
pattern (100). A number of such self-transmissible 
plasmids have since been found in isolates of S. solfa- 
taricus, S. “islandicus,” and related species. The mode 
of transmission appears to be analogous to that of 
classical bacterial conjugation, in the sense that it in- 
volves a plasmid-containing donor and a plasmid-free 
recipient (8). The Sulfolobus conjugative plasmids are 
rather diverse, and only a few genes can be implicated 
in plasmid replication and cell-to-cell transfer by the 
criterion of conservation among these plasmids. This 
suggests that these archaeal plasmids use simpler 
transmission mechanisms than many bacterial con- 
jugative plasmids do (33). 

The remaining example of archaeal conjugation, 
found in S. acidocaldarius, resembles the H. volcanii 
system in that chromosomal markers are transferred 
and the process involves no free plasmid and no ge- 
netic distinction between donor and recipient strain, 
aside from the chromosomal markers used to select 
recombinants (34). The effects of various manipula- 
tions on the efficiency of the S. acidocaldarius sys- 
tem indicate that conjugation can initiate rapidly 
upon cell mixing and can occur efficiently in liquid 
medium (i.e., without stabilization of cell-cell con- 
tact by adsorption to a surface) (28). Genetic proper- 
ties suggest that much of the donor DNA is ultimately 
incorporated into the recipient as relatively small 
fragments. Specifically, (i) unselected markers exhib- 
ited negligible genetic linkage to selected markers in 
three-factor crosses, even when separated by only 500 
to 600 bp, and (ii) when one parental strain was 
forced to serve as the donor by moderate gamma ir- 
radiation, strains bearing large deletions were very 
poor recipients (39). 


Gene Flow in Natural Populations 


Historically, genetic analysis of unicellular or- 
ganisms has involved designating a particular isolate 
(clone) as “wild type” and using it to represent the 
species. This focus on a specific clone provides for de- 
fined genotypes, which are very useful in experimen- 
tal genetics, but it does not address the extensive ge- 
netic variation that occurs in natural populations of 
microorganisms. Conversely, the historic focus of 
classical population genetics on plant and animal sys- 
tems does not address the dynamics unique to hap- 
loid, unicellular organisms, or their phenotypically 
cryptic variation (58). Fortunately, study of microbial 
populations has gained both sophistication and momentum 
with the advent of large-scale DNA sequencing, 
and some of the recent analyses of microbial populations 
have examined archaea. 

Animal and plant speciation has long been un- 
derstood in terms of geographic separation of popu- 
lations, leading to restricted gene (allele) flow between 
populations, reproductive isolation, and relatively un- 
restrained genetic divergence (88). Geographical iso- 
lation has been considered irrelevant for microorgan- 
isms because their large numbers and small size favor 
efficient, long-range dispersal and globally homogeneous 
populations (i.e., “everything is everywhere, the 
environment selects”) (22). This view is reinforced for 
bacteria and archaea by many studies that employ 
only low-resolution phylotypes (e.g., 16S rRNA se- 
quences), which typically fail to reveal geographic 
patterning. Higher-resolution genetic analysis also in- 
dicates a lack of population structure, or “biogeog- 
raphy,” of some bacterial species that have cos- 
mopolitan habitats and facile dispersal (90). Thus, 
the archaea (and bacteria) that lack specific survival 
forms and populate “extreme” environments provide 
a strategic test of the generality of the “no biogeog- 
raphy” assumption. This reflects the fact that the an- 
thropocentric designation “extremophile” identifies 
microorganisms for which the expansive world of 
“normal” habitats cannot support growth and repro- 
duction. Accordingly, the dispersal of viable cells (mi- 
gration), and the resulting sharing of alleles among 
populations should be inefficient and potentially out- 
paced by genetic changes unique to each locale (i.e., 
the creation of new alleles through mutation and the 
loss of alleles through selection and genetic drift) (Fig. 
1). If so, each local population will accumulate its 
own unique alleles, reflecting its own unique history, 
and the overall genetic divergence between popula- 
tions will parallel the physical distance between them, 
reflecting the dominant role played by the geographi- 
cal separation as a barrier to migration. 

This predicted genetic impact of restricted mi- 
gration has been confirmed by multilocus sequence 
typing (MLST) for populations of S. “islandicus” dis- 
tributed throughout the Northern Hemisphere. Sev- 
enty-eight individuals (clones) isolated from acidic 
hot springs and solfataras were first determined to be 
conspecific by the criterion of greater than about 
99% nucleotide identity of nine protein-encoding, 
“housekeeping” genes (119). These nucleotide differ- 
ences were then analyzed phylogenetically and found 
to define five major clades within this species. Each 
clade corresponded to a distinct geographical region: 
the central Kamchatka peninsula, the southern Kam- 
chatka peninsula, Iceland, Lassen Volcanic National 
Park (California), and Yellowstone National Park 
(Wyoming). The total genetic divergence between 
pairs of individuals increased with the distance between 
their sites of origin and did not correlate with 
differences of temperature, pH, or regional geology. 
Thus, the genetic divergence of local Sulfolobus pop- 
ulations was attributed primarily to restricted disper- 
sal, rather than locale-specific selection (119). This 
situation may extend to other microorganisms that 
require special conditions for growth and survival. A 
similar analysis has, for example, indicated genetic 
isolation of at least one population of Pyrococcus 
furiosus (21). 
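
The isolation-by-distance reasoning above can be illustrated with a minimal Python sketch. It is not the analysis used in the cited studies: it simply computes pairwise sequence divergence (the fraction of differing sites in concatenated, pre-aligned loci) and great-circle distance for every pair of isolates, then reports their Pearson correlation. The isolate names, coordinates, and sequences in `TOY_ISOLATES` are invented for illustration only.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres (Earth radius taken as 6371 km)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def isolation_by_distance(isolates):
    """isolates: list of (name, latitude, longitude, concatenated_sequence).

    Returns the correlation between geographic and genetic distance
    over all pairs of isolates.
    """
    geographic, genetic = [], []
    for (_, la1, lo1, s1), (_, la2, lo2, s2) in combinations(isolates, 2):
        geographic.append(great_circle_km(la1, lo1, la2, lo2))
        genetic.append(p_distance(s1, s2))
    return pearson(geographic, genetic)


# Hypothetical isolates: names, coordinates, and sequences are made up.
TOY_ISOLATES = [
    ("site_A", 0.0, 0.0, "AAAAAAAAAA"),
    ("site_B", 0.0, 10.0, "AAAAAAAAAT"),
    ("site_C", 0.0, 50.0, "AAAAATTTTT"),
]

print(isolation_by_distance(TOY_ISOLATES))
```

A positive correlation is the pattern expected under restricted dispersal. Because pairwise distances are not independent observations, a real analysis would assess significance with a permutation approach such as a Mantel test, and would also test the alternative explanation (environmental differences between sites) in the same way.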

Limited migration elevates the importance of 
mutation and recombination in the evolution of nat- 
ural populations (Fig. 1). The relative rates of muta- 
tion and recombination can be assessed by statistical 
analysis of sequence differences (polymorphisms) 
among closely related individuals; this has been done 
in at least three divergent archaea, with generally con- 
gruent results. In the first study, the natural abun- 
dance of a Ferroplasma species in an acid mine 
biofilm enabled its population structure to be char- 
acterized by sequencing of DNAs cloned directly 
from the biofilm (118). The genomic segments re- 
vealed a relatively limited number of polymorphisms 
within the population, but many different combina- 
tions of these polymorphisms. The resulting geno- 
typic diversity of the Ferroplasma species greatly 
exceeded that of the dominant bacterium in the com- 
munity, a Leptospirillum species (118). In a second 
study, isolates of various Halorubrum species culti- 
vated from Spanish salterns exhibited low levels of 
linkage disequilibrium, as assessed by MLST, indicat- 
ing frequent recombination among lineages (76). In 
a third study, detailed MLST analysis of S. “islandi- 
cus” isolates identified 16 distinct six-locus genotypes 
among only 60 isolates recovered from one geother- 
mal site (120). The phylogenies (evolutionary histo- 
ries) of the six loci disagreed, and for any two geno- 
types, differences were much more common between 
different loci (genes) than within a locus. According 
to three different statistical measures, a typical allele 
in this population was several times more likely to 
have been created by a recombination event than by a 
mutational event (120). 
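
Two of the quantities behind such inferences can be sketched briefly. The Python below is a minimal illustration using standard population-genetic definitions, not the specific statistics of the studies cited above: the linkage disequilibrium coefficient D and its normalized form r² for two biallelic loci, and the four-gamete test, which flags a pair of loci whose allele combinations cannot be explained by a single mutation at each locus without recombination (or recurrent mutation). For haploid organisms each isolate contributes one haplotype directly. The function names and example haplotypes are hypothetical.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes: list of (allele_at_locus_1, allele_at_locus_2) tuples,
    one per haploid isolate.
    """
    n = len(haplotypes)
    alleles_1 = sorted({h[0] for h in haplotypes})
    alleles_2 = sorted({h[1] for h in haplotypes})
    if len(alleles_1) != 2 or len(alleles_2) != 2:
        raise ValueError("each locus must carry exactly two alleles")
    a, b = alleles_1[0], alleles_2[0]  # reference allele at each locus
    p_a = sum(1 for h in haplotypes if h[0] == a) / n
    p_b = sum(1 for h in haplotypes if h[1] == b) / n
    p_ab = sum(1 for h in haplotypes if tuple(h) == (a, b)) / n
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


def four_gamete_test(haplotypes):
    """True if all four allele combinations of two biallelic loci occur."""
    return len({tuple(h) for h in haplotypes}) == 4


# Hypothetical populations of ten and eight isolates.
clonal = [("A", "B")] * 5 + [("a", "b")] * 5
shuffled = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")] * 2

print(linkage_disequilibrium(clonal), four_gamete_test(clonal))
print(linkage_disequilibrium(shuffled), four_gamete_test(shuffled))
```

In the strictly clonal toy population the two loci are perfectly associated (r² of 1), whereas in the shuffled population the alleles are independent (r² of 0) and all four combinations are present, the pattern expected when recombination is frequent.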

All three studies indicate extensive shuffling of 
DNA sequences among closely related archaeal lineages. 
Such prevalence of recombination over mutation 
is not unique to archaea; for example, certain bacteria, 
notably naturally competent pathogens, exhibit a similar 
population structure (113). However, most 
bacteria exhibit a highly clonal population structure 
in which such recombination is rare, and this has also 
been reported for bacteria in “extreme” biotopes, 
such as geothermal springs (79). In contrast, no ar- 
chaea have so far been confirmed to exhibit clonal 
population structures, despite considerable phylogenetic 
and ecological diversity of the few species that 
have been examined. 


PERSPECTIVE: THE NEXT FIVE YEARS 


Measuring the Functional Genetic Properties 
of Archaea 


The biological differences separating the Archaea 
from the Bacteria on the one hand, and from the Eu- 
carya on the other hand, are too great to allow the 
nature of genetic processes in archaea to be predicted 
by analogy to either of the other domains. In fact, it 
seems likely that major groups of archaea differ 
among themselves with respect to genetic properties, 
as suggested by the unequal distribution of putative 
DNA repair genes. As a result, genetic properties of 
archaea must continue to be measured experimen- 
tally, which demands the development and validation 
of appropriate quantitative assays. This effort has 
only begun, making comparisons among archaea, 
bacteria, and eucarya highly asymmetric and, thus, 
tenuous. In this context, the hypothesis that archaea, 
or major archaeal groups, have their own distinctive 
set of genetic properties is important, not because of 
its existing support, but because it focuses attention 
on an area of strategic importance for elucidating and 
exploiting the molecular and cellular biology of ar- 
chaea. Testing this hypothesis will involve measuring 
molecular processes central to the survival, reproduc- 
tion, and evolution of archaea and to the develop- 
ment of experimental tools for establishing gene func- 
tion at the molecular level. It thus seems appropriate 
to identify for special scrutiny some of the character- 
istics that seem to be emerging as distinctive properties 
of archaea, even though the experimental data remain 
limited. These properties include (i) genetic conditions 
or states of archaea, (ii) genetic mechanisms that lead 
to such states, and (iii) unusual features of certain ar- 
chaeal enzymes presumed to have genetic impacts. 


How Do Hyperthermophilic Archaea 
Avoid Mutation? 


Sulfolobus spp. have well-defined molecular 
mechanisms, in the form of error-prone DNA poly- 
merases, for converting DNA lesions into mutations 
(6). The low error rates these archaea achieve during 
normal growth therefore imply other, highly accurate 
DNA replication and repair mechanisms that com- 
pete effectively with the error-prone processes, and 
thereby convert lesions into intact DNA. The latter 
mechanisms remain mysterious, however, because of 
the lack of identifiable damage-recognition proteins 
that initiate MMR or NER. The availability of complete 
genome sequences and sensitive mass-spectrometric 
techniques may allow proteins that bind 
specifically to DNA mismatches or UV photoprod- 
ucts to be unambiguously identified in these archaea. 
It will also be important to develop assays in vivo that 
investigate the functional roles of the Rad1/XPF and 
Rad2/XPG homologs of archaea. Do these proteins 
complete the process of NER in hyperthermophilic 
archaea initiated by other proteins? Does their role 
differ in the archaea that encode a complete set of 
UvrABC homologs? 

Archaeal DNA synthesis exhibits interesting 
molecular features (see Chapter 3), including some 
relevant to genome stability. For example, hyperther- 
mophilic archaea synthesize very short Okazaki frag- 
ments (66), implying a highly discontinuous lagging 
strand, whereas discontinuity of the leading strand is 
suggested by the active uracil N-glycosylases of these 
archaea, combined with poor discrimination of their 
family-B DNA polymerases against dUTP (55). (The 
latter property, in combination with “read-ahead 
uracil proofreading,” limits the effectiveness of archaeal 
polymerases in PCR [40].) Are both leading 
and lagging strands discontinuous in these archaea? If 
so, does this discontinuity mark newly synthesized 
DNA for important molecular processes? Though 
speculative, this question is significant because it rep- 
resents a potential basis for the strand discrimination 
required by postreplicational MMR, whether of a 
conventional or novel type. Discontinuity has been 
proposed to be the primary mechanism by which con- 
ventional MMR discriminates daughter from tem- 
plate strands in eucarya and most bacteria (11), and 
alternative MMR systems can be imagined which 
could also exploit daughter-strand discontinuity. For 
example, if two helicases of opposite polarity were recruited 
to a mismatch (in a manner analogous to eucaryal 
NER), translocation along the DNA in both directions 
would ultimately displace the erroneous 
(daughter) strand from the correct (template) strand 
and prepare the region for accurate resynthesis (35). 
It may soon be possible to test certain predictions of 
these hypotheses, either through genetic manipulation 
of uracil DNA glycosylase and dUTPase levels in 
vivo, or through assays in vitro with cell extracts and 
model substrates. 


How Does Homologous Recombination Affect 
Genome Stability? 


HR has been used to disrupt specific genes in 
many archaea, but it has rarely been analyzed in ar- 
chaea as a genetic process. It thus remains unclear 
how the recombination mediated in vivo by archaeal 
RadA proteins compares with that initiated by RecA 
and Rad51 proteins in bacteria and eucarya, respectively. 
Furthermore, the limited data on properties of 
archaeal HR seem to differ even within a single genus. 
Recombination of exogenous DNA with the host 
genome is reported to be virtually inactive in S. sol- 
fataricus strain P1, moderately active in a related Sul- 
folobus isolate, and highly active in S. acidocaldarius 
(39, 43, 126). HR between chromosomal mutations 
in S. acidocaldarius conjugation assays is surprisingly 
efficient over very short distances, which seems to de- 
mand a nonreciprocal mechanism (39). It will be im- 
portant to determine whether this mode of recombi- 
nation is also important over longer distances and in 
contexts outside of conjugation. 

A mode of HR that operates on short regions of 
sequence identity may offer advantages for repairing 
damaged genomes (14), but it also threatens genomic 
stability during normal growth by promoting inap- 
propriate recombination between nearly identical se- 
quences in different regions of the genome. The dele- 
terious consequences of such “ectopic” recombination, 
including gene disruption and gross rearrangements, 
provide a rationale for the second major function of 
known MMR proteins, namely, inhibition of HR be- 
tween sequences containing a few nucleotide differ- 
ences (115). Many of the hyperthermophilic archaea 
that lack identifiable MMR proteins also have low 
G+C genomes and high purine content in the non- 
transcribed strand of genes (77). These compositional 
biases limit DNA sequence complexity and thereby 
tend to create substrates for ectopic recombination. It 
will thus be important to determine how HR in hyper- 
thermophilic archaea responds to minor sequence di- 
vergence and how it compares with HR in mesophilic 
archaea. If differences are observed, it will also be im- 
portant to investigate the role of the “conventional” 
MMR homologs encoded in the mesophilic archaea 
in these differences. 


Are Archaea Adapted for Adaptation? 


Compared with bacteria and unicellular eucarya, 
archaea that have been analyzed in genetic terms have 
rather low rates of neutral mutation (which ignores 
the activity of TEs) and high rates of recombination. 
This combination of properties would seem ideal for 
evolutionary adaptation. Recombination of existing, 
functional sequences is predicted to be a much more 
efficient strategy for improving fitness than mutation 
de novo, reflecting the fact that very few mutations are 
beneficial and many are disadvantageous. Superiority 
of the recombinational strategy also has experimental 
support (30). It will be important, however, to test this 
pattern of low-mutation and high-recombination rates 
across a wider range of archaeal taxa, environments, 
and life-history traits than has been possible in the 
past. Because the relative rates of mutation and re- 
combination can be inferred from statistical analysis 
of sequence polymorphisms, sequencing DNA of 
abundant but uncultivated archaea cloned directly 
from “moderate” habitats, such as marine sponges, 
pelagic plankton, or the rhizosphere of terrestrial plants 
(2, 99, 109) may fill in important gaps in our knowl- 
edge about the relative roles of mutation and recom- 
bination in archaea. Ultimately, combining computa- 
tional genetic analyses of natural populations with 
experimental analyses of cultured archaea will help 
clarify how molecular mechanisms of archaea deter- 
mine genetic properties and how genetic properties of 
archaea affect genome stability and evolution. 


Chapter 6 


Transcription: Mechanism and Regulation 


MICHAEL THOMM 


INTRODUCTION 


The biochemical machinery involved in the 
processes of DNA replication (see Chapter 3), tran- 
scription, and translation (see Chapter 8) shows a 
striking similarity and phylogenetic relationship to 
the equivalent machinery in eucarya (12, 38, 72, 99, 
102). In particular, RNA polymerase (RNAP) and the 
basal transcriptional machinery of archaea share 
many properties with the eucaryal RNA polymerase 
II (RNAP II) transcription apparatus (12, 83, 99). 

The first step in the initiation process is the 
recognition of an AT-rich promoter element (TATA 
box) by the archaeal TATA-binding protein (TBP) 
(Fig. 1; references 87, 98). The archaeal TATA box is 
located ~25 to 30 bp upstream of the transcription 
start site (37, 45, 90). TBP is highly conserved in sequence 
and function (12, 94), and human and yeast 
TBPs can functionally replace archaeal TBP in a 
mesophilic archaeal cell-free transcription system 
(107). The TBP-promoter complex is bound by TFB, 
the second archaeal factor, which is closely related in 
structure and function to eucaryal TFIIB (Fig. 1; ref- 
erences 12, 47). Transcription polarity is governed by 
the interaction of promoter-bound TFB with a con- 
served motif immediately upstream of the TATA box, 
the B recognition element (BRE) (14, 17). This ternary 
complex recruits the archaeal RNAP, which is bound 
to the DNA region downstream of the TATA box. 
The DNA-binding site of the RNAP extends 3’ from 
the TATA box to position +18 (Fig. 1; reference 96). 
All complete archaeal genome sequences contain a 
homolog of the α-subunit of the eucaryal general 
transcription factor TFIIE. Archaeal TFE is not ab- 
solutely required for cell-free transcription but stim- 
ulates transcription of some promoters under certain 
conditions (9, 41). Homologs of the TBP-associated 
factors (TAFs) and eucaryal basal transcription 
factors TFIIA, TFIIH, and TFIIF have not been detected 
in archaeal genomes, although the existence of 
a TBP interaction protein TIP26 (71) raises the pos- 
sibility that the machinery directing transcription in 
archaea is more complex. In general, the archaeal 
transcriptional machinery can be considered a simpli- 
fied version, and the evolutionary precursor of the 
more complex eucaryal machinery. Thus, the archaeal 
system is a useful model for investigating the mecha- 
nism and structure of the eucaryal transcriptional 
machinery. 

The archaeal RNAP has a structure resembling that of 
eucaryal RNA polymerases. The subunit complexity 
and the sequences of individual subunits are highly 
conserved between eucarya and archaea. The small 
subunits E, H, K, L, and N have homologs in eucaryal 
enzymes but not in the Escherichia coli RNAP (13, 
60, 105). The RNAP from Methanocaldococcus jan- 
naschii has been assembled in vitro from individual 
subunits. The complex formed by the subunits D, L, 
N, and P serves as a scaffold for the association of the 
subunits A’, A” and B’, B” forming the active site of 
the enzyme (105). 

In contrast to the basal machinery, most puta- 
tive transcriptional regulators identified in archaeal 
genomes (1, 36, 56) are homologs of bacterial pro- 
teins carrying helix-turn-helix motifs (11, 20). Regu- 
lators of archaeal transcription repress initiation by 
preventing TFB/TBP access to the TATA-box region 
(11, 74) or RNAP recruitment to the transcription 
start site (103). Two regulators, TrmB and TrpY (see 
“The sugar transport regulator, TrmB,” and “Regula- 
tion of the tryptophan operon by TrpY,” below), block 
access of TBP/TFB or RNA polymerase in a promoter-dependent 
manner (61, 62, 110). The most widely represented 
regulators in archaeal genomes are 
members of the Lrp/AsnC family (20). Similar to the 
global regulator Lrp (leucine-responsive regulatory 
protein) of E. coli, they appear to function both as 
repressors (19, 32, 74) and as activators of transcription (77, 
79). Several archaeal regulators with no homologs among 
bacterial or eucaryal regulators, including GvpE (54), 
Phr (103), and TrmB (61), have been identified and 
characterized in some detail. 

Figure 1. Initiation of transcription in archaea. The first step of promoter recognition is binding of TBP to the archaeal TATA 
box. This complex is stabilized by the association of TFB. Bound TFB interacts with the purine-rich BRE sequence 5' of the 
TATA box. This complex recruits the RNA polymerase that binds to the DNA region downstream of the TATA box and covers 
the transcription start site and the DNA downstream region to position +18. 

Due to the lack of tractable genetic systems for 
most archaeal genera (see Chapter 21), the physio- 
logical function of many transcriptional regulators re- 
mains obscure. The development of whole-genome 
microarrays (92, 93) coupled with cell-free transcrip- 
tion experiments using fragmented chromosomal 
DNA as template may provide a useful tool for the 
elucidation of the set of genes regulated by global ar- 
chaeal regulators such as Phr, TrmB, and archaeal ho- 
mologs of the LrpA/AsnC family. 


THE TRANSCRIPTIONAL MACHINERY 


Transcription Signals 


A TATA box at position −25 was identified as 
the first structural determinant of archaeal promoters 
in both the Euryarchaeota (98) and the Crenarchaeota 
(87). Mutational analyses of this sequence 
in cell-free transcription systems for 
Methanococcus (33) and Sulfolobus (49) confirmed 
its significance as a promoter signal 
(40, 42, 86). In Haloferax volcanii the function of the 
TATA box as a major promoter signal was demon- 
strated in vivo (82). Further work revealed some vari- 
ation of the TATA-box sequence among various 
genera of archaea (29, 94, 101). However, the involvement 
of the TATA box as a major determinant of archaeal 
promoters, and the general mechanism of initiation 
(see “Transcription factors” and “The mechanism 
of transcription,” below), seem to be conserved 
in the Archaea. A second conserved promoter signal 
immediately upstream of the archaeal TATA box, the 
BRE, is the principal determinant for the orientation 
of transcription (14). This purine-rich sequence (con- 
sensus RNWAAW; R = purine; W = A or T; N = any 
base) interacts with promoter-bound TFB, and this 
interaction defines transcriptional polarity. 
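
The degenerate BRE consensus given above can be turned into a simple sequence search. The following Python sketch assumes only the consensus stated in the text (RNWAAW, read with standard IUPAC codes); the helper names and the test sequence are hypothetical, and the position of the TATA box is supplied by the caller rather than predicted, because the TATA-box sequence varies among genera.

```python
import re

# IUPAC codes needed for the consensus quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",      # purine
    "W": "[AT]",      # A or T
    "N": "[ACGT]",    # any base
}

BRE_CONSENSUS = "RNWAAW"


def iupac_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[code] for code in consensus)


BRE_RE = re.compile(iupac_to_regex(BRE_CONSENSUS))


def bre_sites(seq):
    """0-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=" + iupac_to_regex(BRE_CONSENSUS) + ")")
    return [m.start() for m in lookahead.finditer(seq.upper())]


def has_bre_upstream(seq, tata_start):
    """True if the six bases immediately 5' of tata_start fit the consensus.

    tata_start is the 0-based index of the first base of the TATA box.
    """
    width = len(BRE_CONSENSUS)
    if tata_start < width:
        return False
    window = seq.upper()[tata_start - width:tata_start]
    return BRE_RE.fullmatch(window) is not None


# Invented promoter fragment: 2 nt, a BRE-like hexamer, a TATA-like box, 2 nt.
example = "GC" + "AGTAAA" + "TTTATATA" + "GG"
print(has_bre_upstream(example, 8), bre_sites(example))
```

A six-base consensus with three degenerate positions matches often by chance in AT-rich DNA, so hits are meaningful only in the context of an adjacent TATA box at the expected distance from a transcription start site.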

Much less is known about the signals directing 
termination of transcription. Oligo-dT sequences 
downstream of archaeal genes were proposed as can- 
didate sequences for archaeal terminators (21, 87, 
108). One study using mutagenesis of the sequence 
5'-TTTTAATT-3' provided evidence for its significance 
as a terminator signal in the Methanococcus 
tRNAVal gene (97). However, in contrast to the eucaryal 
pol III system, where four consecutive T residues 
act as a terminator signal (35), five T residues were 
not sufficient for efficient termination of the tRNAVal 
gene, and sequences encoding the tRNA were clearly 
required for efficient termination in vitro. In contrast 
to the Methanococcus cell-free system, the Sulfolobus 
and Pyrococcus cell-free systems are unable to termi- 
nate transcription accurately, suggesting that addi- 
tional components are required for efficient termina- 
tion. The process of termination of transcription is 
not well understood. Further work is required to de- 
termine the specific sequences involved in termination 
and the factors catalyzing the termination process. 
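
Candidate oligo-dT tracts of the kind discussed above are easy to enumerate in a sequence. The short Python sketch below, with a hypothetical function name and an arbitrary default threshold, lists every maximal run of T residues of at least a chosen length. As the text notes, such runs alone were not sufficient for efficient termination in the Methanococcus system, so the output identifies candidates only.

```python
import re


def t_runs(seq, min_len=4):
    """Return (start, length) for each maximal run of T of at least min_len.

    Positions are 0-based on the strand supplied by the caller.
    """
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# The 8-nt sequence quoted in the text, flanked by invented bases.
print(t_runs("ACG" + "TTTTAATT" + "GC"))
```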


Transcription Factors 


The inability of purified archaeal RNAP to initi- 
ate transcription accurately in vitro provided the first 
evidence for the existence of archaeal transcription 
factors. The first reconstituted transcription systems 
were strictly dependent on the addition of protein 
fractions devoid of RNAP activity (33, 49). A dimeric 
transcription factor from Methanococcus was purified 
to homogeneity (44). This factor could be replaced in 
cell-free transcription experiments by eucaryal TBPs 
(107) and was shown to bind to the archaeal TATA 
box (37). Sequence analysis of archaeal genomes re- 
vealed the presence of homologs of eucaryal TBP and 
TFB (69, 77, 90), and heterologous expression of 
these proteins and analysis of their function in Pyro- 
coccus and Sulfolobus transcription systems revealed 
that these genes encoded homologs of the eucaryal 
transcription factors TBP and TFIIB (47, 84). Thus, a 
minimal archaeal transcription system is defined by 
the factors TBP, TFB, and RNAP (Fig. 1). 

Archaeal TBPs consist of two directly repeated 
protein domains. These two copies of the repeat show 
a high degree (~40%) of amino acid identity. Archaeal 
TBPs lack the N-terminal domain characteristic 
of eucaryal TBPs and possess a highly acidic C-terminal 
tail (reviewed in reference 12). Analysis of the 
crystal structure of Pyrococcus TBP also revealed 
striking similarity to eucaryal counterparts. Pyrococ- 
cus TBP is a saddle-shaped molecule. The DNA-bind- 
ing part of the molecule is located on the underside of 
the saddle (30). 

The structural organization of TFB is similar to 
that of eucaryal TFIIB. TFB consists of three domains, 
an N-terminal domain harboring a zinc-ribbon and a 
B finger (22, 81), and a C-terminal region comprising 
two direct repeats of about 90 amino acids, including 
a helix-turn-helix motif close to the C terminus. The 
structure of the N-terminal domain has been solved 
(112). The metal-binding motif is highly conserved 
throughout the Archaea and Eucarya (95). Deletion 
of the N-terminal domain did not prevent formation 
of the TBP-TFB promoter complex because N-terminal 
truncated versions of TFB were used for structural 
analyses of promoter-bound TFB/TBP, both in the eu- 
caryal and the archaeal system (53, 75). Mutation of 
a conserved amino acid close to the zinc ribbon inac- 
tivated TFB-dependent transcription at some promot- 
ers. This TFB mutant was still able to form a TBP- 
TFB-promoter complex but had lost the ability to 
recruit the RNAP (14). Two-hybrid analyses suggest 
an interaction of the N-terminal domain with subunit 
K of RNAP (68). The finding that reconstituted RNAP 
from M. jannaschii lacking subunit K is still able to 
transcribe promoters in vitro (105) indicates that 
subunit K is not absolutely required for RNAP re- 
cruitment. More components of the RNA poly- 
merase, and probably also other domains of TFB, 
seem to be involved in RNAP-TFB interaction. Struc- 
tural analyses of the yeast TFIIB-RNAPII complex re- 
vealed that the N-terminal domain of TFIIB lies in the 
RNAPII channel harboring the template strand of the 
transcription bubble and the DNA-RNA hybrid that 
forms at early stages of transcription. Site-specific 
photochemical cross-linking has shown that TFB cross- 
links to DNA upstream and downstream of the TATA 
box (7, 8, 88). In addition, TFB cross-links to a DNA 
segment at the transcription start site that is part of 
the open complex. By using recruitment-independent 
nonspecific transcription assays, a stimulatory post- 
recruitment function of M. jannaschii TFB was shown. 
The B-finger domain of TFB was essential for the ob- 
served stimulation of abortive initiation (106). 

All the available structural evidence is consistent 
with the archaeal TFB interacting with DNA and 
RNA polymerase in a manner similar to eucaryal 
TFIIB (34, 77). Taken together, the structural and cross-linking 
data suggest that TFB has the following functions 
(Fig. 2). The C-terminal core domain binds to 
TBP and recognizes the BRE. The N-terminal Zn rib- 
bon interacts with the dock domain of RNAP. The 
contacts between promoter DNA and RNAP are 
weak in the closed complex. During open complex 
formation, the template strand is inserted into the ac- 
tive center of RNAP. The B finger stabilizes the tem- 
plate strand in the active site of the enzyme and facil- 
itates a DNA/rNTP configuration that favors the 
catalytic activity of the RNAP (Fig. 2; 34, 106). 

Eucaryal TFIIE is a heterotetramer (α2β2). It recruits 
TFIIH and is involved in the RNAP-phosphorylating 
activity of TFIIH. Archaeal genomes encode 
a homolog of the N-terminal half of the α-subunit of 
TFIIE. The crystal structure of an N-terminal fragment 
of Sulfolobus TFE has been solved (73). It contains a 
winged helix-turn-helix structure. A modest stimula- 
tory effect of archaeal TFE on in vitro transcription of 
some weak promoters was shown (9, 41). The mod- 
erate level of stimulation of transcription by TFE was 
confirmed using a completely purified system, con- 
sisting of recombinant TBP and TFB and a RNAP re- 
constituted from bacterially produced subunits (80). 
This finding excludes the possibility that strong stim- 
ulatory effects of TFE were underestimated owing to 
the presence of minor quantities of TFE in RNAP 
preparations. Werner and Weinzierl (106) recently 
showed that mutants of the Zn-ribbon and B-finger 
domain of TFB, which are unable to recruit RNAP, 
can be complemented in vitro by TFE. Furthermore, 
experiments using heteroduplex templates located up- 
stream and downstream of the transcription start site 
suggest a role of TFE in promoting DNA melting 
and/or template loading. Taken together these data 
suggest that both factors, TFB and TFE, have an active role in 
influencing the catalytic properties of RNAP and act 
synergistically during initiation of transcription. 


[Figure: domain diagram with labels Zn Ribbon, Linker, and B Finger; position markers 1, 70, 103, and 140] 


Archaea contain a cleavage induction factor, 
TFS. This protein is homologous to the C-terminal 
part of eucaryal elongation factor TFIIS and, in addition, 
to the small subunits A12.2, B12.6, and C11 of 
eucaryal RNAPs I to III (43). This protein does not 
copurify with archaeal RNAP. Experiments with paused 
elongation complexes showed that TFS induces a 
cleavage activity in the archaeal RNAP. TFS confers 
a proofreading activity to RNAP by inducing the re- 
lease of dinucleotides from the 3’ end of nascent RNA 
(59). It would be valuable to examine the role that 
TFS plays in enabling RNAP to overcome barriers to 
RNA elongation (e.g., histones or other DNA-associ- 
ated proteins), and its possible role in transcription 
termination. 


SSB 


The single-stranded DNA (ssDNA)-binding pro- 
teins of crenarchaeota have a domain organization 
similar to E. coli SSB. This includes a single domain 
with an oligonucleotide-binding fold for ssDNA bind- 
ing which is separated by a flexible spacer from an 
acidic C-terminal tail. The archaeal SSB of Sulfolobus 
interacts with RNAP via the C-terminal acidic tail 
(89). Under TBP-limiting conditions SSB stimulates 
transcription, suggesting that it acts at the level of 
RNAP recruitment or initiation. SSB also appears to 
be able to replace TBP in transcription assays. A con- 
tamination of RNAP preparations with trace amounts 
of TBP is unlikely but cannot be excluded. 

The repression of transcription observed in the
presence of the major chromatin component of
hyperthermophiles, the ALBA (Sso10b) protein, is
relieved by SSB (89). The specific interaction of the
acidic tail of SSB with RNA polymerase is critical for
this stimulation of transcription. Localized melting of
DNA in the AT-rich TATA-box region mediated by
SSB may overcome the requirement for TBP. Analyses
of DNA templates harboring AT-rich sequences
located within structural genes will reveal whether the
ability of SSB to replace TBP is promoter specific, or
whether SSB-directed initiation can also occur within
AT-rich open reading frames.

Figure 2. Domain structure of TFB. The major structural features of TFB and their interactions with other components of the transcriptional machinery.


RNA Polymerase 


The similarity of the multisubunit archaeal
RNAP to eucaryal RNAP was one of the first recognized
eucaryal features of the Archaea (39, 113). Archaeal
RNAPs consist of 11 subunits, B, A′, A″, D,
E, F, L, H, N, K, and P (Crenarchaeota, Pyrococcus;
Fig. 3), or 12 subunits. The largest subunit, B, is split
into two subunits in methanogens (91), yielding the
12-subunit composition A′, B′, B″,
A″, D, E, F, L, H, N, K, and P. The homology of
RNAP subunits in the three domains of life is shown
in Fig. 3. In general, the larger subunits are paralogs,
although the sequence similarities are much more
pronounced between Archaea and Eucarya than between
Bacteria and Archaea or Bacteria and Eucarya.
The archaeal paralog of the bacterial subunit β′ is
split into two polypeptides in archaea: A′ and A″. A′
is related to the N-terminal half of Rpb1 and β′, and
A″ is related to the C-terminal half of Rpb1 and β′.
Subunit D is related to Rpb3 and a part
of α, and L is related to Rpb11 and to a different
segment of α (Fig. 3).

Figure 3. (See the separate color insert for the color version of this illustration.) Subunit structure of RNAPs from the three
domains of life. The largest subunit in the Eucarya and β′ in the Bacteria is split into two subunits, A′ and A″, in the Archaea.
In methanogens, subunit B is also split into two polypeptides, B′ and B″. Different parts of bacterial subunit α are encoded by
the genes for the archaeal subunits D and L. Subunits E′, F, H, N, and P are only shared between the Archaea and Eucarya. The
pattern shown is based on separation of subunits by polyacrylamide gel electrophoresis under denaturing conditions. The
numbers in the subunits of the eucaryal RNAP A (I), B (II), and C (III) indicate the molecular mass.

The RNAP of the methanogen M. jannaschii was
reconstituted from 12 recombinant subunits
(105). The α₂ dimer of bacterial RNAP serves as a
platform for assembly of the larger subunits (50).
Studies of Rpb3 and Rpb11 subunits suggest that the
archaeal D-L heterodimeric complex is the structural
and functional equivalent of the bacterial α homodimer
(31, 104). The D-L-N-P subcomplex was used
as a platform for reconstitution of an active M. jan- 
naschii RNAP. For reconstitution of M. jannaschii 
RNAP, the subunits D-L and E-F, which are not solu- 
ble when expressed separately, were coexpressed as 
soluble heterodimers. The Pyrococcus RNAP that 
was reconstituted from 11 separately expressed sub- 
units exhibited wild-type levels of promoter-specific 
activity (Naji et al., unpublished). This finding indi- 
cates that preassembly of the D-L-N-P complex is not 
a prerequisite for the assembly of an archaeal RNAP. 




The recombinant enzyme was
used to investigate the function of essential domains
of the enzyme and the contributions of small subunits
to RNAP function, and to identify the minimal subunit
configuration for specific RNA synthesis. The
complex A′-A″-B′-B″-D-L was soluble but devoid of
activity. Activity was restored by the addition of
subunits N and P (105). Subunit K, the paralog of
bacterial ω and eucaryal Rpb6, was not required for
activity but did enhance the activity of the minimal
active assembly (A′-A″-B′-B″-D-L-N-P) twofold.
Two-hybrid analyses have shown that Sulfolobus K
interacts specifically with the N-terminal domain of TFB
(68). Hence, interaction of K with the zinc ribbon of
TFB seems to participate in recruitment of RNAP by
the TBP-TFB-promoter complex (Fig. 4). However,
it is not essential for that step since omitting K from
reconstitution reactions does not completely abolish
promoter-directed transcription.

The zinc ribbon of eucaryal TFIIB does not bind
to Rpb6 but binds to a surface pocket of the dock,
wall, and clamp domain of RNAP II, which is formed
by the two large subunits Rpb1 and Rpb2 (24). This
finding suggests an interesting difference in the
molecular architecture of eucaryal and archaeal RNAP.
Subunit H is the homolog of eucaryal Rpb5. Rpb5
and Rpb1 form the lower jaw of RNAP II and are
likely to be involved in contacting the template DNA.
Subunit H stimulated the promoter-specific activity of
the archaeal enzyme up to 10-fold, indicating that it
contributes considerably to specific activity of the
reconstituted enzyme. The eucaryal homologs of
subunits E/F, Rpb4/7, are essential (34), but while E/F are
easily incorporated into the recombinant archaeal
RNAP, they are not essential for its activity (105).
The large subunits of archaeal RNAP, A′ and B′,
contain the metal A and metal B motifs of RNAP II (27)
involved in Mg²⁺ chelation near the catalytic center
of RNAP. Site-specific mutagenesis of conserved
amino acids in these two motifs completely abolished
or substantially reduced specific activity of the
archaeal enzyme, indicating a high degree of
conservation of the basic mechanism of transcription between
the Archaea and Eucarya.

Figure 4. (See the separate color insert for the color version of this illustration.) Structural similarity of Pyrococcus RNAP (A)
and yeast RNAPII (B). Comparison of interactions of an archaeal RNAP inferred from Far-Western analysis with interactions
of yeast RNAPII observed in the crystal structure of the enzyme. The width of the lines connecting subunits is a measure of
the intensity of the interaction. Modified from Science (27) with additional data from Proceedings of the National Academy
of Sciences USA (2).

The interactions of recombinant subunits of the
Pyrococcus RNAP have been extensively studied to
investigate the molecular structure of archaeal RNAP
(Naji et al., unpublished). Far-Western blot analysis
revealed strong interactions among D, L, N, and P, the
proposed platform for assembly, and interactions of
this platform with subunit B (Fig. 4A). The B-P-N-L-D
complex is connected to the rest of the enzyme via
B-A′, N-A′, and P-E′ interactions. Subunits E-F also
show strong contacts and can be purified with the main
complex by gel filtration (36a). In summary, the
structural similarity of Pyrococcus RNAP to eucaryal RNAP II is
striking (Fig. 4). However, determination of the crys- 
tal structure of the archaeal enzyme is required to 
confirm and refine the similarities and differences in- 
ferred from biochemical analyses. 


THE MECHANISM OF TRANSCRIPTION 


A comprehensive and consistent view of the 
mechanism of transcription by archaeal RNAP has 
been gained from the recently solved structures of the 
yeast RNAPII (2, 26) and the TFIIB-RNAPII complex 
(22), protein-protein cross-linking analyses in the 
eucaryal system (24, 25), protein-DNA cross-linking 
analyses in the archaeal system (7, 8, 88), analyses of 
open complex formation and paused transcription 
complexes (16, 46, 96), and analyses of reconstituted 
RNAP and archaeal TFB variants (105, 106) (Figs. 4 
and 5). As a first step, the saddle-shaped archaeal TBP
binds to the minor groove of an 8-bp TATA box. This 
leads to bending of DNA by approximately 65° (53). 
The process of TBP binding is stabilized and stimu- 
lated by TFE. TFB associates with the TBP-promoter 
complex whereby the C-terminal domain makes contacts
with TBP and the BRE of the archaeal promoter
and defines the polarity of archaeal transcription
(17). The RNAP is recruited by interactions of the 
TFB zinc ribbon with the dock domain of RNA poly- 
merase and with subunit K. The contacts of RNAP 
with DNA in the closed complex are weak (Fig. 5A). 
In the open complex, the template strand comes into 
contact with the active site and is stabilized by the B 
finger of TFB (Fig. 5B). TFE stabilizes this complex by 
closing the mobile clamp of RNAP via subunits E/F. 
During promoter clearance, RNAP loses contact with
transcription factors (Fig. 5), and TFB and TFE are 
probably released. On strong promoters TBP remains 
bound in complexes containing transcripts of 4 to 
24 nucleotides (109). On weak promoters TBP dis- 
sociates after promoter clearance (34). The nascent 
RNA is directed toward subunits E/F (Fig. 5C) in the 
elongation complex. 

The position of Pyrococcus RNAP on the DNA,
the extension and location of the transcription bubble,
and the length of the RNA-DNA hybrid in transcription
complexes paused at various positions have
been analyzed by exonuclease III and potassium
permanganate footprinting (96). In complexes stalled at
position +5, RNAP binds downstream of the transcription
factors and the binding site of the enzyme
extends to position +18 (Fig. 6A). The RNAP is in close
contact with promoter-bound transcription factors 
and therefore an upstream end of the RNAP DNA- 
binding site cannot be defined by this technique. The 
transcription bubble comprises 12 bp and extends 
from −7 to +5 (similar to the bubble found in the
open complex) (12, 46). The first major transition 
during early transcription is observed in complexes 
stalled at position +6/+7 (Fig. 6B). An upstream end 
of RNAP can now be defined, indicating a conforma- 
tional change of RNAP, but the downstream end of 
the RNAP is still located at position +18. This find- 
ing indicates that 7 nucleotides of RNA can be syn- 
thesized without any translocation of RNAP along 
the template. The transcription bubble of this complex
extends from −7 to +9, indicating that reclosure
of the DNA of the open complex has not yet occurred 
at this stage of transcription. The second major tran- 
sition occurs at position +10/+11 (Fig. 6C). In this 
complex, translocation of the downstream edge of 
RNAP to position +24 occurs and reclosure of the 
DNA in the region upstream of the transcription ini- 
tiation site begins, indicative of promoter clearance. 
The transcription bubble extends over 17 bp and the 
RNA-DNA hybrid over 9 bp, similar to the extent of 
coverage that occurs in subsequent elongation com- 
plexes. The distance of the active center to the down- 
stream edge of RNAP is ~12 nucleotides. 
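
The bubble sizes quoted above follow from promoter coordinates in which +1 is the transcription start site and no position 0 exists. The short Python sketch below makes that arithmetic explicit for the two complexes whose bubble boundaries are both given in the text. The function name span_bp and the dictionary labels are illustrative choices only and are not taken from the footprinting study (96).

```python
def span_bp(start: int, end: int) -> int:
    """Count base pairs from start to end, inclusive, on a promoter
    coordinate system where +1 is the transcription start site and
    position 0 does not exist."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # An interval crossing from negative to positive coordinates would
    # otherwise count the nonexistent position 0.
    if start < 0 < end:
        length -= 1
    return length


# Bubble boundaries as quoted in the text for two stalled complexes.
bubbles = {
    "stalled at +5": (-7, 5),
    "stalled at +6/+7": (-7, 9),
}

for label, (upstream, downstream) in bubbles.items():
    size = span_bp(upstream, downstream)
    print(f"{label}: bubble {upstream:+d} to {downstream:+d} = {size} bp")
```

For these two complexes the calculation gives 12 bp and 16 bp, respectively; the 12-bp value matches the bubble size stated for the complex stalled at +5.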

The archaeal enzyme appears to be committed to
elongation at register +10. This transition occurs in
all multisubunit RNAPs in members of all three
domains of life. One characteristic of the archaeal
system is that the distance of the catalytic center of
RNAP to the front edge of the footprint is 12 bp; this
is shorter than in bacterial and eucaryal RNAPs,
where this distance is approximately 18 bp. Between
positions +11 and +20, RNAP and the transcription
bubble move synchronously with RNA synthesis.
There is no evidence for discontinuous translocation
of the archaeal RNAP in later stages of transcription.

Figure 5. Mechanism of transcription by an archaeal RNAP. (A) TFE facilitates binding of the TFB zinc ribbon domain to the
core domain of RNAP. (B) After open complex formation, the B finger of TFB stabilizes the template strand in the active
center of RNAP. TFE provides additional stability to this complex by closing the clamp of RNAP. (C) After synthesis of a transcript
longer than 10 nucleotides, RNAP reaches the elongation-committed state. RNAP moves synchronously with RNA synthesis
from this point.


REGULATION OF TRANSCRIPTION 


Considering the eucaryal nature of the basal 
transcriptional machinery, the detection of bacterial- 
like regulators in the genomes of archaea (1, 55, 56) 
was a surprise. A mosaic archaeal transcriptional ma- 
chinery was proposed, implying the regulation of a 
eucaryal basal machinery by bacterial regulators. 
However, many regulators characteristic of the Ar- 
chaea have been identified and characterized. These 
findings demonstrate that unique pathways of regula- 
tion of transcription exist in the Archaea, and these 
deserve special attention. 


The Lrp Family Contains Paralogs of Bacterial 
Proteins with Repressor and Activator Activities 


Bacterial Lrp homologs have been identified in 
94% of the analyzed archaeal genomes (20). Members 
of the Lrp family are small DNA-binding proteins. In 
E. coli, members of the Lrp family regulate the tran- 
scription of 10% of genes. The E. coli regulon in- 
cludes genes involved in amino acid metabolism and 
in pili synthesis. Binding of bacterial Lrp can repress 
or activate transcription, and binding of its modulator, 
leucine, can either stimulate or reduce the repressor 
or activator activity of Lrp at specific promoters. 

Archaeal paralogs of Lrp were among the first 
biochemically characterized archaeal transcriptional 
regulators (15, 19, 28). The crystal structure of LrpA 
from Pyrococcus has been solved (63). The N-termi- 
nal part of the protein contains a helix-turn-helix mo- 
tif involved in binding to dyad symmetry elements in 
the DNA (78). The N-terminal part is connected by
a hinge to the C-terminal domain. In bacterial
homologs this region has been shown to be involved in
the response to leucine and the activation of tran- 
scription. Similar to bacterial Lrp, archaeal homologs 
tend to form homooligomers. Free LrpA from Pyro- 
coccus exists mainly as a dimer, and DNA-bound LrpA 
exists mainly as a tetramer (19). 

LrpA and Lrs14 from Sulfolobus inhibit transcription
of their own genes in vitro. The DNA-binding
site of LrpA overlaps the RNAP-binding site, and
DNA-bound LrpA inhibits transcription by blocking 
RNA polymerase recruitment (28). Sulfolobus Lrs14 
binds to the DNA region harboring the TATA box. It 
acts on an earlier step of preinitiation complex as- 
sembly by preventing binding of TBP/TFB (15, 74). 
Aside from autoregulation of transcription in vitro, no 
information is available on the physiological function 
of Sulfolobus Lrs14 and Pyrococcus LrpA. In contrast,
an Lrp homolog of M. jannaschii, Ptr2, was found
to affect transcription of the ferredoxin A and rubre- 
doxin 2 genes, suggesting that it is involved in regula- 
tion of redox reactions. However, unlike LrpA and 
Lrs14, which are repressors of their own promoters, 
Ptr2 is an activator of these two promoters (77). 

Ptr2 conveys a stimulatory effect on transcrip- 
tion by facilitating recruitment of TBP to the promoter. 
Ptr2 binds to two adjacent palindromic sites. Muta- 
tional analysis of this bipartite upstream-activating 
sequence (UAS) showed that the promoter proximal 
site is sufficient for Ptr2-dependent promoter activa- 
tion of the rubredoxin and ruberythrin genes (79). 
The UAS differs significantly from the UAS of the 
bacterial ilvH promoter, which is bound by bacterial 
Lrp. In M. jannaschii six binding sites within a DNA 
segment of 200 bp are involved in binding two Lrp 
octamers, forming a compact “UASome” (51). The 
assembly of archaeal Lrps into octamers and helical
arrays in crystals (52, 63) led to the proposal of a
UAS structure in which DNA is wrapped around
arrays of laterally acting protein dimers. However,
despite the potential of archaeal Lrp-like molecules to
form higher-ordered structures, analysis of the UAS in
an M. jannaschii cell-free system has shown that a
single cis-acting site is sufficient for activation of
transcription. Although a single UAS is reminiscent of
simple bacterial promoters, analysis of Ptr2-mediated
activation revealed a clear difference from bacterial
activators. Ptr2 activates transcription by recruiting
a eucaryal transcription factor and not by interacting
with subunits (α and σ) of RNAP. Future analyses of
transcriptional regulators will reveal whether
archaeal activators have evolved that directly modulate
RNAP binding. The subunits D and L, which are
related to different parts of bacterial α, would be the
candidate binding partners for such a regulator (σ
factors do not exist in the Archaea).

A second Lrp protein family, represented by
LysM, also appears to be involved in activating
transcription. The LysM gene is part of a gene cluster
encoding enzymes for lysine biosynthesis. Transcription of part of
the lys gene cluster (lysWXJK) is induced upon lysine
starvation in vivo (18). LysM binds to the promoter
of lysW. Lysine can modulate the DNA-binding
properties of LysM: in the absence of lysine, LysM
binding to DNA is enhanced. LysM has been suggested
to act as an activator of transcription
of genes for lysine biosynthesis. However,
LysM has not been reported to affect cell-free
transcription of the lys operon. This may be due to the
absence of a hitherto unknown coactivator in the
cell-free system.


Negative Regulation: MDR1, NrpR, TrpY


MDR1, a metal-dependent repressor


The function of MDR1, an Archaeoglobus 
fulgidus homolog of the bacterial metal-dependent re- 
pressor DtxR, was studied in the Sulfolobus cell-free 
transcription system and in Archaeoglobus cells (11). 
The gene encoding MDR1 is located upstream of
three genes encoding an iron-importing ABC
transporter, and all four genes are cotranscribed as a
polycistronic transcription unit. In vivo expression of the
MDR1 gene was strictly dependent on metal ion
availability. In assays based on chromatin
immunoprecipitation, binding of MDR1 to operator DNA in
Archaeoglobus cells was found to depend on the
presence of bivalent cations. MDR1 binds cooperatively
in a metal-dependent manner to three operator
sequences located between positions −18 and +67
relative to the transcription start site. MDR1 represses
transcription from its own promoter in vitro by
preventing RNAP recruitment. This repression also
depends on bivalent cations. DNA binding of DtxR, the
homolog of MDR1 from Corynebacterium, is also
metal dependent. MDR1 was the first archaeal
repressor whose binding to operators was shown to be
influenced by an inducer in vitro and in vivo.


The regulator of nitrogen fixation 
in methanogens, NrpR 


Some methanogens utilize dinitrogen as a nitro- 
gen source. A regulator of nif gene expression, NrpR, 
has been characterized in Methanococcus mari- 
paludis. NrpRs from M. maripaludis, M. thermauto- 
trophicus, and M. jannaschii are tetrameric DNA- 
binding proteins. The archaeal NrpR contains an 
N-terminal winged helix-turn-helix domain and two 
conserved domains that may function in dimerization 
or multimerization. Homologs lacking one or more of 
the three domains are present in other methanogens 
and Archaeoglobus but were not found in crenarchaeota,
suggesting that NrpR represents a novel
family of regulators unique to euryarchaeota (64). 

M. maripaludis can utilize ammonia, alanine,
and dinitrogen as sources of nitrogen. NrpR controls
the transcription of the nif operon by binding
cooperatively to two tandem operator sequences, OR1 and
OR2, located downstream of the transcription start
site. The stronger and promoter-proximal
NrpR-binding site (OR1) can mediate repression of nif
transcription during growth on ammonia. Both OR1 and
OR2 are required for intermediate repression during
growth on alanine. 2-Oxoglutarate is an intracellular
indicator of nitrogen deficiency and binds to NrpR
and lowers its binding affinity to the operators (65).
Hence, 2-oxoglutarate acts as an inducer of nif gene
expression in archaea and this induction is brought
about by NrpR. This is the first archaeal system where
the roles of a repressor, inducer, and two operators 
have been investigated both in vivo and in vitro. 
However, the mode of interaction of NrpR with the 
transcriptional machinery is unexplored. The occur- 
rence of NrpR is restricted to euryarchaeota, but co- 
operative binding of two repressor dimers to tandem 
operators resembles bacterial repressor systems. 
NrpR binds downstream of the transcription start site 
and is likely to inhibit RNAP recruitment. 


Regulation of the tryptophan operon by TrpY 


Tryptophan synthesis is energetically very expen- 
sive, and as a consequence expression of the trp genes 
is tightly regulated. TrpY is a regulator that contains
an N-terminal helix-turn-helix DNA-binding motif
and a C-terminal domain that binds tryptophan as
an allosteric effector. The M. thermautotrophicus
trpY gene is transcribed divergently from a promoter
overlapping the promoter of the trpEGCFBAD
operon (110). The TATA boxes of these promoters
are separated by one helical turn and are therefore 
located on different faces of the DNA helix. TrpY 
binds to four TRP boxes (consensus TGTACA) lo- 
cated in the overlapping promoter region between 
trpY and trpE. In cell-free transcription experiments, 
TrpY has been found to autorepress its promoter in 
the absence of tryptophan while expression of the 
tryptophan operon and a separately encoded gene of
the Trp pathway, trpB2, is only inhibited in the presence
of tryptophan. TrpY is a dimer in solution and blocks 
transcription at the trpY operators 1 and 2, probably 
by inhibiting RNAP recruitment. Binding of trypto- 
phan to TrpY appears to induce a conformational 
change leading to increased affinity of TrpY to the 
nonconsensus TRP boxes 3 and 4 downstream of the 
TATA box of the trpE gene. In this conformational 
state, TrpY retains its potential to bind to the box 1 
and box 2 consensus operator sequences, thereby reg- 
ulating expression of the trpY gene. The box 3 and 
box 4 TrpY binding sites overlap with the TATA box 
of the trpE and trpY promoters, respectively. The
TrpY-tryptophan repressor-ligand complex acts by a
different mechanism, competing with TBP for binding
to the TATA box. Due to the close spacing of the
divergent promoters, simultaneous transcription
initiation does not seem possible. The mechanism
controlling expression of trp genes and/or trpY is not yet
understood, although regulation at the level of trans- 
lation has been proposed (110). 
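
The distinction drawn above between consensus TRP boxes and nonconsensus boxes can be illustrated by a simple motif scan. The Python sketch below is a minimal illustration only: the consensus TGTACA comes from the text, but the test sequence, the function names, and the one-mismatch threshold are hypothetical and do not represent the real trpY-trpE intergenic region or any published scoring scheme.

```python
CONSENSUS = "TGTACA"  # TRP box consensus quoted in the text
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions at which a candidate site differs from the consensus."""
    return sum(a != b for a, b in zip(site, consensus))


def find_boxes(seq: str, max_mismatch: int = 1):
    """Return (index, site, mismatch count) for every window that lies
    within max_mismatch of the consensus. The threshold is an assumption."""
    k = len(CONSENSUS)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        m = mismatches(window)
        if m <= max_mismatch:
            hits.append((i, window, m))
    return hits


# Hypothetical toy sequence containing one consensus and one degenerate box.
toy_sequence = "AATGTACATTTTGTTCAGG"

# The consensus reads the same on both strands.
print(reverse_complement(CONSENSUS) == CONSENSUS)

for index, site, m in find_boxes(toy_sequence):
    kind = "consensus" if m == 0 else "nonconsensus"
    print(index, site, kind)
```

On the toy sequence the scan reports one exact match and one site with a single mismatch, which mirrors the consensus and nonconsensus categories used in the text without implying anything about their actual sequences.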

In silico analyses revealed that TrpY paralogs are 
present in other euryarchaeota but not in crenar- 
chaeota (36). TrpY acts superficially like the trypto- 
phan-sensitive bacterial regulator, TrpR (111). How- 
ever, there is no evidence for a common ancestry of 
these proteins (110). 


The Production of Specialized Organelles 
in Haloarchaea: Further Examples 
of Positive Regulation 


In Haloarchaea, the genes involved in formation 
of gas vesicles and the purple membrane are under 
positive gene regulatory control. Fourteen genes (gvp
genes) are involved in gas vesicle synthesis in 
Halobacterium salinarum (48). GvpE is an activator 
of GvpA, the major structural protein of gas vesicles 
(54). GvpE contains a leucine zipper motif similar to 
that found in the eucaryal activator GCN4. However, 
aside from this motif it has no sequence similarity to 
eucaryal transcriptional regulators. GvpE therefore 
appears to represent a type of activator that is unique 
to the Archaea. A site upstream of the BRE sequence 
of one GvpE-activated promoter is involved in tran- 
scriptional regulation (48). Cell-free transcription sys- 
tems are not available for haloarchaea, and the exact 
mechanism of GvpE-mediated activation is unclear. 
However, its activity is inhibited by the GvpD repres- 
sor. GvpE and GvpD interact in vitro, and an inter- 
action of these proteins in vivo may be responsible for 
GvpD-mediated inhibition of GvpE. 

The transcriptional regulator Bat regulates ex- 
pression of genes responsible for the synthesis of pur- 
ple membranes in Halobacterium (6). Synthesis of the 
purple membrane is highly induced in response to 
light intensity and low-oxygen tension. Bat contains a 
photoresponsive cGMP-binding domain (GAF), a 
bacterial AraC-type helix-turn-helix domain, and a 
PAS/PAC domain involved in sensing the redox status 
of the cell. The bop gene directs the expression of the 
major structural protein of the purple membrane. Its 
promoter was thoroughly investigated using mutagenesis
(3, 4). The DNA gyrase inhibitor novobiocin
blocks bop gene induction, suggesting that supercoiling
of DNA stimulates bop transcription. A
DNA-supercoiling sensitivity site and a UAS upstream of
BRE were identified as specific features of this
promoter. The sensitivity of the bop gene to supercoiling
was correlated to the presence of an alternating 
purine-pyrimidine sequence (RY box) overlapping the 
TATA box of the bop promoter by 4 nt. A bop-like 
UAS, the putative binding site for Bat, is conserved 
upstream of three additional Bat-regulated genes in 
Halobacterium which are presumably involved in 
retinal and carotenoid biosynthesis. A family of reg- 
ulators related to Bat exists in Halobacterium, sug- 
gesting that regulatory networks responding to envi- 
ronmental changes in light and oxygen availability 
have evolved in halophilic archaea (6). 


Regulation of the Heat Shock Response 


Even though the hyperthermophile Pyrococcus 
grows at temperatures higher than 100°C, it has 
evolved a heat shock response to cope with higher- 
temperature fluxes that occur in its natural environ- 
ment (58). The major chaperone classes Hsp100, 
Hsp90/Hsp83, and Hsp70 (DnaK) are absent from 
the genomes of hyperthermophilic archaea, although 
they are present in some mesophilic and psychrophilic 
species (58, 67). The major heat shock proteins pre- 
dicted by a bioinformatics analysis (36), a Hsp60-like 
chaperonin (thermosome), two other chaperones belonging
to the AAA+ family, and a Hsp20-like small
heat shock protein, were found to be highly induced
upon heat shock (57, 93). The AAA+ and hsp20 promoters
were investigated in cell-free transcription experiments.
A palindromic sequence overlapping the
transcription start site (TTT..T..C...G..A..AAA)
was identified as a characteristic feature of Pyrococcus
heat shock promoters (Fig. 7). A putative regulator
of the heat shock response, Phr, was found to selectively
inhibit transcription of these templates (103)
by binding to a conserved, inverted-repeat heat shock 
element. The binding site of Phr overlaps the tran- 
scription start site, and operator-bound Phr inhibits 
RNAP recruitment to the promoter (Fig. 7). 
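
The palindromic character of the heat shock element can be checked directly. The Python sketch below reads the motif quoted above as TTT..T..C...G..A..AAA, with each dot standing for one unspecified base; that spacing is an assumption reconstructed from the printed sequence. A motif of this kind is an inverted repeat if it equals its own reverse complement, with unspecified positions pairing with unspecified positions.

```python
# Complement table; "." marks an unspecified base and pairs with itself.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", ".": "."}


def reverse_complement(pattern: str) -> str:
    """Reverse complement of a motif that may contain '.' wildcards."""
    return "".join(COMPLEMENT[base] for base in reversed(pattern))


def is_inverted_repeat(pattern: str) -> bool:
    """True if the motif reads the same on both DNA strands."""
    return reverse_complement(pattern) == pattern


# Motif as read from the text; the dot spacing is an assumption.
HEAT_SHOCK_MOTIF = "TTT..T..C...G..A..AAA"

print(len(HEAT_SHOCK_MOTIF), is_inverted_repeat(HEAT_SHOCK_MOTIF))
```

Under this reading the motif spans 21 positions and is its own reverse complement, consistent with the description of a conserved inverted-repeat element recognized by a dimeric regulator.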

The crystal structure of Phr has been solved. It 
is a winged helix-turn-helix protein that contains four 
helices in the N-terminal domain with a C-terminal 
domain that is involved in dimer formation and pos- 
sibly in effector binding. Mutational analyses re- 
vealed that amino acids in three helices and the wing 
region are involved in operator recognition, suggest- 
ing a novel mode of DNA-protein interaction. Phr is 
conserved among Euryarchaeota (36). At present it 
is unknown whether an environmental factor (e.g., 
heat), an effector molecule, and/or an additional reg- 
ulator modulates DNA binding of Phr. 

In the genome of Halobacterium strain NRC-1,
six copies of TBP and seven of TFB (5) have been
identified. This indicates that alternative
TBP-TFB-RNAP complexes may form and provides an
explanation for the diversity of halophilic promoters and
the regulation of heat shock response. Multiple copies
of TBP and TFB also exist in other haloarchaea, and
two TFB copies are present in Pyrococcus: TFB1
(identical to TFB), the factor used for cell-free
transcription experiments, and TFB2, not investigated
thus far. Heat shock-induced upregulation of some
TFB genes from haloarchaea and of TFB2 from
Pyrococcus has been reported (93, 100). Pyrococcus
TFB2 can replace TFB1 in cell-free transcription
experiments, but Pyrococcus heat shock promoters did
not show any specificity for TFB2, suggesting that
TFB2 is not involved specifically in transcription of
heat shock promoters (M. Micorescu, A. Franke, M.
Thomm, and M. Bartlett, manuscript in preparation).

Figure 6. The two major transitions in archaeal transcription initiation. (A) In the preinitiation complex and during synthesis
of the first five nucleotides, RNAP is in close contact with transcription factors and the transcription bubble extends from
position −7 to +5. (B) After synthesis of 6/7 nucleotides, the upstream edge of RNAP loses contact with transcription factors
but the downstream edge is unchanged. (C) At position +10/+11, promoter clearance occurs and RNAP moves continuously
to enable RNA synthesis.

Figure 7. Interaction of a Pyrococcus heat shock regulator (Phr) with heat shock promoters. Phr binds specifically to a
conserved palindromic sequence of archaeal heat shock promoters overlapping the transcription start site. When bound to the
DNA, Phr blocks RNAP recruitment. The factors modulating the DNA-binding properties of Phr are unknown.

In bacteria, the 5’-untranslated region (5’-UTR) 
of some cold shock-induced genes contains a con- 
served 11-nucleotide-sequence element, referred to as 
the cold box. A 113-nucleotide 5’-untranslated region 
was found upstream of a DEAD-box RNA helicase of 
the Antarctic methanogen Methanococcoides bur- 
tonii (66). This 5'-UTR contains a sequence closely 
matching a bacterial cold-box element. Therefore, 
Bacteria-like regulatory elements in 5'-UTRs seem to 
be involved in cold adaptation in Archaea. In the bac- 
terium Bradyrhizobium, at least five heat shock genes 
are under the control of a conserved 100-nucleotide 
DNA segment (ROSE) positioned precisely in the 5'- 
UTR between the transcriptional and translational 
start sites (76). This cis-acting element confers tem- 
perature control by preventing translation at physio- 
logical growth temperatures. Although ROSE is in- 
volved mainly in the regulation of small heat shock 
proteins that also play a major role in Archaea (57, 
58), there is no evidence that a similar mechanism relying
on a secondary structure of RNA in the 5’-UTR
is operating in the regulation of heat shock response 
in archaeal cells. 


The Sugar Transport Regulator, TrmB: 
a Molecule Responding to Different 
Ligands in a Promoter-Dependent Manner 


In Pyrococcus, two distinct ABC transporter sys- 
tems exist for the uptake of maltose/trehalose (mal 
genes) and maltodextrins (mdx genes). The expression
of the operons encoding these ABC transporter
systems is regulated by TrmB, a global regulator. The
interaction of TrmB with the malE promoter of the 
maltose ABC transporter system and with the mdxE 
promoter of the maltodextrin ABC transporter sys- 
tem has been studied (61, 62). At the malE promoter, 
TrmB binds to a sequence overlapping the TATA box 
and inhibits cell-free transcription. This is likely to 
occur by inhibiting TBP/TFB binding to the promoter. 
The transcriptional inhibition of malE is reversed by 
maltose or trehalose, which bind to the repressor and
cause a change in conformation leading to dissociation
of TrmB from the operator (Fig. 8). When TrmB
is bound at the mdxE promoter, a different situation is 
encountered. (i) The DNA recognition site is different 
in sequence and location and overlaps the transcrip- 
tion start site (Fig. 8). (ii) The addition of maltose or 
trehalose does not release TrmB from the promoter, 
suggesting that binding to DNA induces a conforma- 
tional change in the protein. In contrast, in cell-free 
transcription reactions, the addition of the substrate 
(maltodextrins) of this transporter system causes 
TrmB to dissociate from the promoter and relieves
inhibition of RNA synthesis (Fig. 8). TrmB thus recognizes
two different sequences. Its ability to respond to ligands
in a different manner depending on its association with a
specific DNA sequence is unique among transcriptional
regulators and reveals that the Archaea have evolved
regulatory mechanisms that do not appear to be present
in the Bacteria and Eucarya.

Figure 8. The archaeal regulator TrmB responds to different ligands when bound at different promoters. In the absence of any
ligands, TrmB binds to its operator sequences at the maltose (mal) and maltodextrin (mdx) promoters. At the mal promoter,
TrmB binding is influenced by maltose as inducer, but not by maltodextrins. At the mdx promoter, maltose has no effect on
TrmB binding, but maltodextrins lower its affinity for the operator. The TrmB-binding sites differ substantially at the two
promoters: the TrmB-binding site overlaps the BRE/TATA box sequence at the mal promoter and the transcription start site
at the mdx promoter. The smaller triangle represents maltose, and the larger triangle, maltodextrins. The TATA box is
indicated; the DNA-binding sequence of TrmB is represented by a shaded box and shown on both promoters, and the binding
sequence is shown below TrmB. The transcription start site is indicated by +1. The binding site of TrmB contains a palindrome
at the mal promoter and is represented by two horizontal arrows. Only one half of it is conserved in the mdx promoter.


THE ROMA APPROACH 


The lack of genetic systems in many archaea 
hampers analysis of transcriptional regulation in vivo. 
For example, 9 and 5 copies of Lrp-like genes are pre- 
sent in the genomes of Pyrococcus furiosus and Sul- 
folobus solfataricus, respectively (20). However, the 
only function assigned to Pyrococcus Lrp-A is inhi- 
bition of its own promoter in cell-free transcription 
assays. Analysis of the first complete-genome DNA 
microarray for a hyperthermophilic archaeon re- 
vealed a high degree of coordinated regulation in cells 
of Pyrococcus (92). Of the 2,065 open reading frames 
(ORFs) annotated in the genome of P. furiosus, the 
expression of 125 of them differed by more than five- 
fold between cultures grown with peptides or maltose 
as a primary carbon source (see Chapter 20). However,
it is difficult to infer from these global analyses
the effects of particular regulators on the modulation 
of transcription. 

A novel approach to identify the targets of regu- 
lators uses fragmented chromosomal DNA as a tem- 
plate for cell-free transcription reactions (Fig. 9). The 
transcripts from this template can be labeled and 
hybridized to microarrays. A comparison of the 
microarray hybridization patterns of transcripts obtained
in the absence and presence of a given regulator
will identify the genes affected by this regulator.
Initial results (79) indicate that chromosomal DNA from
archaea can be used successfully as template in cell-free 
transcription reactions. This in vitro transcriptomic 
approach, ROMA (runoff transcription/macroarray 
analysis; 23, 79), may be a useful tool to identify the 
target genes of archaeal regulators. 
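
The comparison step of this approach can be sketched in a few lines of Python. This is only an illustration of the idea of comparing hybridization signals obtained without and with a regulator; the gene names, the signal values, the function name roma_targets, and the twofold cutoff are all assumptions made for the example and are not taken from the studies cited above.

```python
import math

def roma_targets(without_regulator, with_regulator, min_fold=2.0):
    """Return genes whose in vitro transcript signal changes by at least min_fold.

    Both arguments map a gene name to a hybridization signal (arbitrary units).
    """
    threshold = math.log2(min_fold)
    targets = {}
    for gene, baseline in without_regulator.items():
        treated = with_regulator[gene]
        if baseline <= 0 or treated <= 0:
            continue  # skip spots without a usable signal
        log2_ratio = math.log2(treated / baseline)
        if abs(log2_ratio) >= threshold:
            # Lower signal with regulator suggests repression, higher suggests activation
            targets[gene] = "repressed" if log2_ratio < 0 else "activated"
    return targets

# Hypothetical signals from transcription reactions on fragmented chromosomal DNA
signal_minus = {"geneA": 1000.0, "geneB": 800.0, "geneC": 500.0}
signal_plus = {"geneA": 100.0, "geneB": 790.0, "geneC": 2000.0}

print(roma_targets(signal_minus, signal_plus))
```

In this toy data set, geneA would be flagged as a candidate target of repression and geneC as a candidate target of activation, whereas geneB is unaffected. Real array data would also require normalization and replicate experiments, which are omitted here.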


EVOLUTIONARY IMPLICATIONS 


The striking similarity of the archaeal and eu- 
caryal genetic machinery described in this chapter 
and the chapters on translation (Chapter 8) and repli- 
cation (Chapter 3) sheds new light on the evolution of 
the eucaryal cell. The commonly accepted theory is 
that the eucaryal cell developed independently after 
separation of three major phylogenetic lineages. The 
lineage leading to Archaea and Eucarya was separated
later in evolution, between 1.9 and 1.5 billion
years ago, whereas the lineage leading to Bacteria 
branched earlier (3.5 to 1.9 billion years ago). This 
implies that Eucarya and Archaea might have a com- 
mon evolutionary history of possibly two billion 
years; this may account for the similarity of their ge- 
netic machinery. But this does not explain how the 
eucaryal cell was generated. Also, the endosymbiosis 
hypothesis provides only an explanation of how mod- 
ern eucaryal cells containing organelles derived from 
free-living Bacteria have evolved from preexisting
eucaryal cells. But how was the first eucaryal cell
containing an archaeal transcriptional machinery
generated in evolution? Did the last common ancestor
of Archaea and Eucarya already contain the basic
archaeal machinery, including major components such as
TATA box-containing promoters, TBP, TFB, and RNAP, that
later evolved in Eucarya into the more complicated
machinery of modern Eucarya? The
other possibility is that the eucaryal cell was gener- 
ated by a fusion of an archaeal cell, providing the ba- 
sis for the eucaryal nucleus and cytoplasm contain- 
ing the genetic machinery, and a bacterial cell carrying 
the energy-generating electron transport system. An 
elegant theory proposes that the eucaryal cell has 
arisen through symbiotic association of an anaerobic, 
strictly hydrogen-dependent archaeon (possibly a 
methanogen) with a hydrogen- and CO2-producing
facultative anaerobe able to respire (possibly a
proteobacterium) (70). The hydrogen dependence of the
host (the archaeal cell) provides a strong selective
force in evolution for irreversible association and
subsequent incorporation of the hydrogen-producing 
proteobacterium (the symbiont). The resulting prim- 
itive eucaryal cell would have a cytoplasm with an ar- 
chaeal genetic machinery and autotrophic metabo- 
lism and a heterotrophic symbiont generating ATP 
from organic compounds either by anaerobic (and 
possibly later also aerobic) respiration or via fermen- 
tation. This hypothesis provides an ingenious expla- 
nation for the archaeal nature of the eucaryal tran- 
scriptional machinery, as well as a new endosymbiosis 
hypothesis. The host for endosymbiosis of the bacte- 
rial cell was not a differentiated eucaryal cell but a hy- 
drogen-dependent archaeon. If this hypothesis is true, 
then archaeal machinery is not eucarya-like; rather, 
the eucaryal machinery is archaeal as it is derived 
from Archaea. 


PERSPECTIVE: THE NEXT FIVE YEARS 


The archaeal transcriptional machinery is the 
evolutionary ancestor of the more complex eucaryal 
machinery, and recent reports describing the post- 
recruitment function of TFB and TFE suggest that it 
serves as a useful model to address questions that may 
be difficult to address in eucaryal systems. Although 
many regulators resemble bacterial repressors, unique 
regulatory mechanisms have been elucidated. This 
highlights that investigation of regulatory pathways, 
an area still in its infancy, will reveal novel insight 
into the mechanism and the evolution of regulatory 
principles in the Archaea. Future analyses of archaeal 
transcription need to take into account the role 
played by archaeal histones (85) and chromosome-as- 
sociated proteins, such as Alba (10) (see Chapter 4). 
Chapter 7


RNA Processing 


GABRIELE KLUG, ELENA EVGUENIEVA-HACKENBERG, ARINA D. OMER, 
PATRICK P. DENNIS, AND ANITA MARCHFELDER 


INTRODUCTION 


Archaea represent a phylogenetic lineage that is
separate and distinct from Bacteria and Eucarya.
Although they lack a nucleus and their gene
organization, genome structure, and basic central 
metabolism are bacterial-like, their DNA replica- 
tion, transcription, and translation machineries are 
eucaryal-like. This review primarily deals with the 
production and function of the archaeal translational 
machinery and the mechanisms used for posttran- 
scriptional regulation of gene expression. The core 
components of the translational apparatus are repre- 
sented by the three major classes of RNA: mRNA, 
which contains a copy of the genetic information car- 
ried in the DNA that is decoded into the amino acid 
sequence of proteins during translation; tRNA, which 
decodes the information carried on the mRNA and 
aligns amino acids for polymerization; and rRNA, 
which is the core component of the ribosome, the ri- 
bonucleoprotein machine that coordinates the decod- 
ing process and catalyzes the formation of the peptide 
bonds during amino acid polymerization (see Chap- 
ter 8). In addition to the three major classes of RNA, 
Archaea, similar to Bacteria and Eucarya, contain a 
plethora of other types of small RNAs that function 
in the processing, modification, and assembly of the 
translational apparatus or in the control of gene ex- 
pression at the translational level. 


MATURATION OF TRANSFER RNA ENDS 


Most organisms contain between 40 and 50 different
tRNA molecules that read one or more of the
61 or 62 different sense codons through specific
codon-anticodon interactions and position the appro- 
priate amino acid for insertion into the growing 
polypeptide chain. Recent data indicate that tRNAs 
are dynamic and have coevolved with the ribosome to 
ensure speed and accuracy in the translation process 
(37, 42, 123). The archaeal genes that encode the dif- 
ferent tRNAs are either individually transcribed, co- 
transcribed with other tRNA genes, or cotranscribed 
with other types of genes (20). In all cases the tRNAs 
are contained within a longer precursor (pre-tRNA), 
which requires nuclease processing at the 5’ and 3’ 
ends of the mature tRNA sequence. In general, the 
3'-terminal CCA sequence present in all tRNAs is not 
encoded in the tRNA gene; after numerous nucleotide 
modifications are introduced into the tRNA, the CCA 
trinucleotide is added by a terminal transferase to 
produce the mature tRNA molecules (70) (Fig. 1). 
Additional complexity in the tRNA processing and 
modification pathway occurs in cases where archaeal 
tRNA genes contain an intron (usually in the anti- 
codon loop of the tRNA). Excision of the intron, and 
exon ligation, is closely linked to modification and 
processing (36). 


Maturation of the tRNA 5’ End by RNase P 


In all organisms the 5’-leader sequence of the 
pre-tRNA is removed by a universally conserved en- 
donuclease, RNase P (EC 3.1.26.5; reviewed in ref- 
erence 55). RNase P is a ribonucleoprotein complex 
composed of a single RNA and one (in Bacteria) or 
more than one (in Archaea and Eucarya) protein sub- 
unit. Most bacterial and archaeal RNase P RNAs are 
structurally similar (type A); the exceptions are Bacillus
and relatives (type B), Thermomicrobium (type
C), and Methanocaldococcus and relatives (type M).
Eucaryal RNase P RNAs are distinct (type E) from
those of bacteria and archaea (63).

Figure 1. Maturation of precursor tRNA in Archaea. Precursor tRNAs are transcribed with 5'-leader and 3'-trailer sequences
that have to be removed to yield a functional tRNA. Processing at the 5' end is performed by the ubiquitous enzyme RNase P.
The tRNA 3'-end maturation is catalyzed by the endonuclease tRNase Z; after this the 3'-terminal CCA sequence is added by
the tRNA nucleotidyltransferase. Some archaeal tRNA precursors contain introns that are removed by the splicing endonuclease,
which recognizes the bulge-helix-bulge (BHB) structural motif that forms between the exon/intron and intron/exon
boundaries. The two halves of the tRNA that are generated by intron excision are joined by a tRNA ligase activity that has
not yet been identified or characterized. In addition to nucleolytic processing, numerous nucleosides in the tRNA are
subjected to modification. After all these processing and modification steps, the tRNA is ready for aminoacylation. The order of
processing events is not known, and the scheme depicted is not necessarily what occurs in vivo.

Under elevated ionic conditions, the RNAs from 
bacteria are catalytically active in vitro in the absence 
of protein (62). Some archaeal type A RNAs exhibit 
similar, albeit weak, ribozyme activity and can be re- 
constituted with the Bacillus subtilis protein to create 
catalytically proficient chimeric holoenzymes (106). In 
contrast, the eucaryal RNAs require the protein com- 
ponents for in vitro catalytic activity. In all organisms, 
the genes encoding both the RNA and protein com- 
ponents of RNase P are essential for viability (105). 

The bacterial RNase P has a single protein subunit 
of rather small size, whereas eucaryal RNase P enzymes 
contain up to ten protein subunits, none of which are 
related to the single bacterial protein (28, 152). In the 
genome of Methanothermobacter thermautotrophicus 
four open reading frames (ORFs) encode proteins with 
similarity to the yeast nuclear RNase P protein subunits 
Pop4 (MTH11), Pop5 (MTH687), Rpp1p (MTH688), 
and Rpr2p (MTH1618). All four M. thermauto- 
trophicus ORFs have obvious homologs in other ar- 
chaeal genomes (63). Thus, the protein subunits of ar- 
chaeal RNase P enzymes are clearly homologous to 
eucaryal RNase P subunits rather than those of bac- 
teria. Based on the homology of protein subunits and 
the similarity of the protein-protein contacts in the 
yeast and archaeal RNase P complexes, it seems likely 
that the structures of the holoenzymes from the Eu- 
carya and Archaea are remarkably similar (64). The 
endonuclease activity that processes the 5’ end of 
tRNAs has been recovered by reconstituting recombi- 
nant protein subunits and in vitro transcribed RNA 
from M. thermautotrophicus and Pyrococcus horikoshii;
all four archaeal protein subunits were required
for activity (17, 84). Taken together, these observations
imply that the archaeal RNase P RNP is a
chimera of bacterial and eucaryal parts: the RNA 
subunit of the ribonucleoprotein RNase P more 
closely resembles the bacterial homolog, whereas the 
protein components are related to those found in the 
eucaryal complex. 


Processing at the tRNA 3’ End by tRNase Z 


In Bacteria, two modes of tRNA 3’-end matura- 
tion exist; either the 3’-trailer sequence is digested by 
a combination of exonucleases (49) or the endonucle- 
ase tRNase Z (EC 3.1.26.11, reviewed in reference 
144) cleaves the precursor to remove the trailer (96, 
107). In Eucarya the tRNA 3’-trailer sequence is re- 
moved by the endonuclease tRNase Z, which exists in 
two forms. The short form (which is the one found 
in Bacteria) is about 250 to 350 amino acids in length, 
and the long form is 750 to 900 amino acids in length. 
All archaeal organisms analyzed so far use the short 
form of tRNase Z endonuclease to remove the tRNA 
3’ trailer (126, 127). The enzyme cleaves immediately 
3’ to the discriminator base (the discriminator is the 
first unpaired base, extending on the 3’ end from the 
tRNA acceptor stem) leaving a 3’-hydroxyl group to 
allow immediate addition of the CCA trinucleotide 
by the tRNA terminal transferase enzyme. tRNase 
Z enzymes belong to the metallo-β-lactamase family,
which is characterized by a specific structural fold,
two central β-sheets flanked on each side by α-helices
(6). Members of this family require at least one man- 
ganese, iron, or zinc ion for activity. The Escherichia 
coli tRNase Z requires zinc. The bacterial, archaeal, 
and the two paralogous eucaryal (the long and short 
form) tRNase Z enzymes in general exhibit high se- 
quence similarity to each other. 
The tRNase Z enzymes from four archaea have
been isolated: Pyrococcus furiosus, Methanococcus
jannaschii, Haloferax volcanii, and Pyrobaculum
aerophilum (96, 127; S. Schubert and A. Marchfelder,
in preparation). All four have been expressed in E. coli
and have tRNA-processing activity in vitro. The recombinant
halophilic enzyme is inhibited in vitro by
KCl concentrations higher than 100 mM (Schubert
and Marchfelder, in preparation). Similarly, extracts 
from H. volcanii that were dialyzed with molecular 
weight cutoff filters of 30 kDa are also inactive in 3’- 
tRNA end processing at high-salt concentrations 
(126). Thus, the tRNase Z is one of a few enzymes 
from a halophilic archaeon that is not active in vitro 
under high salt concentrations. 


Addition of the 3'-Terminal CCA Sequence 


Mature tRNAs from all three domains of life 
carry the CCA sequence at their 3’ termini (Fig. 1). 
The CCA terminal sequence is critical for the base- 
pairing interaction between the 3’ end of the tRNA 
and the A (loop nucleotides around position 2552; 
numbering based on E. coli 23S rRNA) and P sites 
(loop nucleotides around position 2251) within the 
large subunit rRNA (123). Some bacteria and archaea 
encode tRNA genes that include the CCA sequence, 
whereas most of the bacterial and archaeal and all of 
the eucaryal tRNAs require addition of the 3’-termi- 
nal CCA sequence. The enzyme responsible for syn- 
thesis and regeneration of the CCA sequence is the 
ATP (CTP):tRNA nucleotidyltransferase (EC 2.7.7.25; 
also referred to as tRNA terminal transferase); the 
gene encoding this enzyme has been identified in all 
three domains. The tRNA nucleotidyltransferases be- 
long to the superfamily of nucleotidyltransferases that 
also includes poly(A) polymerase and DNA poly- 
merase B. The ability of the tRNA nucleotidyltrans- 
ferase to add specific nucleotides in the absence of a 
nucleic acid template makes it an intriguing poly- 
merase. In organisms where many or most tRNA 
genes do not encode the CCA, the tRNA nucleotidyl- 
transferase gene is essential (35). The first identifica- 
tion of the archaeal homolog was difficult (155), be- 
cause of the low level of sequence similarity to the 
bacterial/eucaryal proteins and the low abundance 
of the protein in cell extracts (39, 155). Mutational 
analysis of the Sulfolobus enzyme showed that there 
is only a single active site for addition of the CCA nu- 
cleotides (35, 156). The archaeal tRNA nucleotidyl- 
transferase from Archaeoglobus fulgidus has been 
cocrystallized with different substrates (tRNA-C, 
tRNA-CC, and tRNA-CCA) (153). The structures 
show that the tRNA acceptor stem remains fixed on
the enzyme as C75 and A76 are added (it is not known
yet whether this is also true for addition of C74), while
the growing 3’ end refolds to reposition the new 3'- 
hydroxyl group relative to the incoming nucleotide 
and the catalytic nucleotidyltransferase motif. 
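
The two 3'-end maturation steps described above (cleavage immediately 3' to the discriminator base, followed by template-independent addition of C, C, and A) can be illustrated with a deliberately simplified Python sketch. The sequence, the position of the discriminator, and the function name are hypothetical, and the model ignores 5'-leader removal, nucleoside modification, and tRNA genes that already encode the CCA.

```python
def mature_3prime_end(pre_trna, discriminator_index):
    """Toy model of tRNA 3'-end maturation on an RNA string.

    discriminator_index is the 0-based position of the discriminator base.
    Returns the matured sequence and the removed 3' trailer.
    """
    if not 0 <= discriminator_index < len(pre_trna):
        raise ValueError("discriminator must lie within the precursor")
    # tRNase Z step: cut immediately 3' to the discriminator base
    body = pre_trna[: discriminator_index + 1]
    trailer = pre_trna[discriminator_index + 1 :]
    # Nucleotidyltransferase step: add C74, C75, and A76 one at a time,
    # without any nucleic acid template
    for nucleotide in "CCA":
        body += nucleotide
    return body, trailer

# Hypothetical precursor: 7 nt ending in the discriminator "A", then a 4-nt trailer
mature, trailer = mature_3prime_end("GGGCCCAUUUU", 6)
print(mature)   # GGGCCCACCA
print(trailer)  # UUUU
```

The loop adds one nucleotide per iteration to mirror the stepwise addition seen in the crystal structures; it makes no claim about how the enzyme selects CTP or ATP.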

The tRNA nucleotidyltransferases are divided 
into two classes (class I and class II) that are distinguished
by their amino acid sequences (155). Class I
tRNA nucleotidyltransferases are present in Archaea, 
whereas class II tRNA nucleotidyltransferases are 
found in Bacteria and Eucarya. In certain deep-root- 
ing bacteria like Aquifex aeolicus and Deinococcus 
radiodurans, CCA-adding is the joint responsibility of 
related class II CC- and A-adding enzymes (138). The 
CC is added by one activity (the CC-adding enzyme) 
and the A is added by a second, closely related activ- 
ity (the A-adding enzyme). 


INTRONS IN ARCHAEAL TRANSCRIPTS 


Introns that disrupt the exon-coding regions of 
genes have been found in all three domains of life. In- 
tronic sequences are transcribed by RNA polymerases 
and are removed from the transcript by endonucle- 
ase excision. The functional RNA is formed by liga- 
tion or splicing of the flanking exons. Three distinct 
splicing mechanisms for (i) group I introns, (ii) group 
II, group III, and spliceosomal introns, and (iii) archaeal
introns and eucaryal nuclear tRNA introns are
known (27, 91, 95, 98, 108). Group I and group III 
introns have not been detected in archaea, whereas 
group II introns have been found only in the Metha- 
nosarcinaceae, members of the Euryarchaeota (40, 
139). In contrast, several archaeal rRNA and tRNA 
genes and a single protein-coding gene (149) have 
been found to contain short intron sequences rang- 
ing from 14 to 106 nt in length that are related to the 
eucaryal nuclear tRNA introns. 


Archaeal Group II Introns 


The Methanosarcinaceae are Euryarchaeota with 
large genomes in the range of 4.1 to 5.75 Mb (48, 
58). They are the only archaea that contain group II 
introns in their genome sequences (40, 139). These in- 
trons are similar to bacterial class D group II introns 
and chloroplast-like class 1 introns. In M. acetivo- 
rans, 21 group II introns were detected, including 
seven that lack an internal ORF (40). The other 14 in- 
trons that contain an internal ORF appear to encode 
reverse transcriptases. The internal ORF is, in gen- 
eral, considered to be a canonical feature of all group 
II introns (40). The ORF-less archaeal group II in- 
trons, together with ORF-less group II introns from 
the cyanobacterium Thermosynechococcus elongatus 
BP-1, are the only known exceptions to this highly
conserved feature. 


Introns in tRNA, rRNA, and mRNA Genes 


The introns present in tRNA and rRNA genes 
are variable in sequence but contain a highly con- 
served structural motif (the bulge-helix-bulge or BHB 
motif) that forms between the 5’ exon/intron and 
3’ intron/exon boundaries in the transcribed RNA 
(136). Similar introns ranging in size from 11 to 60 nt 
have been found in eucaryal tRNAs, but these lack 
the BHB structural motif. Eucaryal tRNA introns are 
always located in the anticodon loop between nu- 
cleotides 37 and 38. Archaeal tRNA introns are most 
often located at the same position. They can also ex- 
ist at other locations in the tRNA, however, includ- 
ing the anticodon, 5’ to the anticodon, the amino acid 
acceptor stem, the D loop, the T loop, the anticodon 
stem, and the variable arm (93). In a few cases two in- 
trons have been found in a single tRNA gene (93). 

Many other intron-related anomalies have been 
observed in the genomes of some archaeal species. In 
Nanoarchaeum equitans, several of the tRNA genes 
are split into 5'- and 3’-half genes that are well sepa- 
rated on the chromosome and separately transcribed. 
The trailer of the 5'-half gene and the leader of the 
3'-half gene are partially complementary and form an 
extended helix that includes the anticodon stem of the 
mature tRNA (see Chapters 8 and 9). The imperfect 
helix contains a BHB motif and is believed to be a 
substrate for the splicing endonuclease that initiates 
a trans-splicing reaction generating the mature tRNAs 
(115). In Crenarchaeota, some of the BHB-associated 
introns that occur in the 23S rRNA genes (23, 41, 77) 
are large and contain ORFs that encode a homing en- 
donuclease with a conserved LAGLIDADG motif 
(91). These introns are also excised by the splicing en- 
donuclease (78). In three species of thermophilic and 
hyperthermophilic crenarchaeota, short introns that 
generate the BHB motif at the exon/intron boundaries 
have been detected in the gene encoding the aCbf5 
protein. This protein is a component of archaeal 
H/ACA pseudouridylation guide RNPs (see “Modifi- 
cation of tRNA and rRNA nucleosides,” below). A re- 
verse transcriptase (RT) PCR assay was used to dem- 
onstrate that the introns are removed from the mRNAs 
in vivo (149). 

The BHB motif that forms at the exon/intron/ 
exon junctions of intron-containing archaeal tRNAs 
is necessary and sufficient for cleavage by the archaeal
splicing endonuclease. In contrast, the eucaryal en- 
zyme uses tRNA structure to recognize the positions 
of intron excision (108, 141). The archaeal introns 
are removed by a two-step mechanism. The splicing
endonuclease cleaves the intron boundaries at both
ends, leaving the 5’ exon with a 2’,3’ cyclic phosphate 
end group and the 3’ exon with a 5'-OH end group. 
The tRNA ligase subsequently joins the tRNA exon 
halves to yield the complete tRNA while circularizing 
the intron. At the present time no ligase protein has 
been identified, and it has been suggested that ligation 
may be RNA catalyzed and mediated by the BHB mo- 
tif (45, 115). The eucaryal tRNA-splicing pathway has 
one more step than the archaeal pathway and uses a 
protein ligase. The tRNA-splicing endonuclease per- 
forms a similar reaction, leaving the same end groups, 
whereas the protein ligase joins the tRNA exon halves 
by a complex series of reactions requiring ATP and 
GTP and leaves a 2'-phosphate at the splice site junc- 
tion. A third enzyme, the 2’-phosphotransferase, is re- 
quired to remove the 2’-phosphate. 
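
As a rough illustration of the archaeal two-step pathway described above, the following Python sketch treats the precursor as a string and the intron boundaries as known positions. It is a toy model under stated assumptions: the sequence and coordinates are invented, the function name is hypothetical, and no attempt is made to detect a BHB motif or to model the chemistry of the end groups beyond the comments.

```python
def splice_archaeal_intron(pre_rna, intron_start, intron_end):
    """Toy model of the two-step archaeal pathway: cleave, then ligate.

    The intron occupies pre_rna[intron_start:intron_end] (0-based, end exclusive).
    """
    if not 0 < intron_start < intron_end < len(pre_rna):
        raise ValueError("intron must lie strictly inside the precursor")
    # Step 1: the splicing endonuclease cuts at both intron boundaries
    exon5 = pre_rna[:intron_start]             # 3' end carries a 2',3'-cyclic phosphate
    intron = pre_rna[intron_start:intron_end]
    exon3 = pre_rna[intron_end:]               # 5' end carries a 5'-OH group
    # Step 2: ligation joins the exon halves; the excised intron is circularized
    return {"mature": exon5 + exon3, "circular_intron": intron}

# Hypothetical 12-nt precursor with a 4-nt intron at positions 4 to 7
result = splice_archaeal_intron("AAAAGGGGUUUU", 4, 8)
print(result["mature"])           # AAAAUUUU
print(result["circular_intron"])  # GGGG
```

The eucaryal pathway would need an additional step after ligation to represent removal of the 2'-phosphate left at the splice junction.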


The Splicing Endonuclease 


Both the archaeal and eucaryal splicing endonu- 
cleases have been identified (EC 3.1.27.9). Within the 
Archaea, three different forms of the splicing en- 
donucleases have been identified. In Euryarchaeota 
the splicing endonuclease (SE) is either an a -homod- 
imer (e.g., two 37-kDa subunits in H. volcanii) or an 
a4-homotetramer (e.g., four 20-kDa subunits in M. 
jannaschii). In Crenarchaeota (e.g., in Sulfolobus sol- 
fataricus) and Nanoarchaeota the splicing endo- 
nuclease is a heterotetramer consisting of paralogous 
a5B>-subunits (137). The euryarchaeal SE is more 
stringent, recognizing only canonical BHB motifs, 
whereas the crenarchaeal SE also recognizes non- 
canonical BHB motifs (24). In the Eucarya the enzyme 
is an aByd-heterotetramer consisting of a 54-kDa, a 
44-kDa, a 34-kDa, and a 15-kDa subunit (141). The 
eucaryal splicing endonuclease cleaves tRNA introns 
at position 37/38 within a correctly folded tRNA 
molecule. The 44- and 34-kDa subunits are paralogs 
to each other and homologs to the archaeal splicing 
endonuclease a-subunit. 

The archaeal splicing endonuclease cannot
splice eucaryal tRNA introns, but the eucaryal enzyme
can recognize and cleave a synthetic RNA substrate
containing an archaeal BHB motif in vitro and
in vivo (50, 54, 57). The splicing endonuclease may
have evolved from the more primitive homotetrameric
α4-type enzyme (93). In addition to the homotetrameric
α4-form, after divergence of Crenarchaeota
and Euryarchaeota, a homodimeric form
[(α-α)2] of tRNA endonuclease may have arisen in the
Euryarchaeota as a result of a duplication that fused
two α-like sequences together within a single gene to
form the larger protein. In Crenarchaeota, the
heterotetrameric α2β2-form may have appeared following
a duplication of the α-gene and divergence of one of
the genes to encode the paralogous β-subunit. In the
Eucarya, an αβγδ heterotetramer appears to have evolved
(141) through the retention of the α- and β-subunits and the
addition of two new protein subunits.


The tRNA Ligase 


In addition to the splicing endonuclease, a tRNA 
ligase (EC 6.5.1.3) is involved in tRNA intron splic- 
ing. In the Eucarya, at least three distinct ligase activ- 
ities from yeast, plants, and animals have been ob- 
served (52, 108). In yeast, the enzyme is a monomeric 
92-kDa protein containing three intrinsic activities: an 
N-terminal adenylyltransferase domain that resembles 
T4 RNA ligase 1, a central domain that resembles T4 
polynucleotide kinase (without 3'-phosphatase activity),
and a C-terminal cyclic phosphodiesterase (CPDase) 
domain (5, 109, 125). Yeast tRNA ligase is also re- 
sponsible for nonspliceosomal splicing of mRNA in 
the unfolded protein response pathway (129). In 
plants it is an enzyme of 125 kDa (Arabidopsis 
thaliana), which contains motifs for three intrinsic
enzyme activities: an adenylyltransferase/ligase domain,
a polynucleotide kinase domain, and a cyclic phos- 
phodiesterase domain (52). Comparison of the yeast 
and plant ligase sequences reveals no similarities, al- 
though the physical order of intrinsic enzymatic ac- 
tivities has been conserved (52). The broad substrate 
range of plant tRNA ligases clearly suggests that the 
action of this enzyme is not limited to pre-tRNA 
splicing or tRNA repair (52). The archaeal ligase re- 
action appears to be somewhat similar to the animal 
type, because in both cases, the junction phosphate 
is derived from the precursor (159). Neither archaeal 
nor animal enzymes have been sufficiently purified for 
detailed characterization. 


MATURATION OF RIBOSOMAL RNA 


The genes for ribosomal RNA (rrn) are located 
in operons in the archaeal genome and are tran- 
scribed to produce multicistronic precursor RNAs 
(Fig. 2). The number of rRNA operons per genome 
varies between one (extreme thermophiles) and four 
(Methanococcus vannielii) (60)
(see Chapter 8). In general, bacterial rRNA operons 
contain the 16S rRNA gene, one or more tRNA genes 
in the internal transcribed spacer (ITS), the 23S rRNA 
gene, the 5S rRNA gene, and one or more distal 
tRNA genes (111). A similar organization is generally 
observed in Euryarchaeota, where the rRNA operon 
contains the 16S, 23S, and 5S rRNA genes with a 
tRNA(Ala) gene located in the ITS and a tRNA(Cys) gene
in the distal position (Fig. 2A). In contrast, in the
Crenarchaeota (e.g., Thermoproteus tenax, Desulfurococcus
mobilis, Sulfolobus acidocaldarius) the 5S and
tRNA genes are positioned elsewhere in the genome 
and are not part of the rRNA operons (80, 111). In 
euryarchaeal halophiles, the rrn operons are preceded
by a series of up to ten tandemly arranged promoters
that serve as the sites for transcription initiation
and generate transcripts with 5’-external-transcribed 
spacer (ETS) sequences of variable length (44). The 
bacterial and archaeal pre-rRNA transcripts in gen- 
eral contain inverted repeats surrounding the 16S and 
23S RNA sequences that form extended helical struc- 
tures and contain the sites for the initial endonucleo- 
lytic cleavage and excision of pre-16S and pre-23S 
from the primary transcript (Fig. 2B). In E. coli the 
endonuclease responsible for the pre-rRNA excision 
is the helix-specific RNase III. No RNase III-like en- 
zyme has been identified in the Archaea (38). The in- 
verted repeats surrounding the large rRNAs in the Ar- 
chaea contain a BHB motif that is recognized by the 
splicing or BHB endonuclease (29). Extensive conser- 
vation of sequence within the 5’-ETS in halophiles 
suggests that these sequences play an active role in 
processing of the rRNA or assembly of ribosomal 
subunits (46). 

Two early studies used nuclease protection and 
primer extension assays to define the intermediates 
generated during processing of the primary rRNA 
transcript from two canonical rrn operons (typical 
euryarchaeotal rrn operons as described above): the 
single-rrn operon in Halobacterium salinarum and 
the canonical rrnA operon in Haloarcula marismortui 
(Fig. 2B) (29, 47). Those studies demonstrated (i) that 
the BHB motif within the processing stem surround- 
ing the 16S sequence is a site required for the excision 
of pre-16S from the primary transcript; (ii) that exci- 
sion of pre-16S is a prerequisite for RNase P-mediated 
cleavage at the 5’ end of the spacer tRNA" (iii) that 
processing at the 3’ end of the spacer tRNA“"* is en- 
donucleolytic, is temporarily unordered, and can oc- 
cur at any point in the processing pathway; (iv) that 
maturation of the spacer tRNA“" can be disrupted by 
endo- or exonuclease cleavages within the mature 
tRNA sequence; (v) that the maturation of the 23S 
rRNA can occur directly (albeit at low efficiency) 
without endonuclease excision at the BHB site in the 
processing stem; and (vi) that the 5S sequence is excised
as a precursor and trimmed at the 5’ and 3’ ends.
Moreover, similar characterization of the noncanoni- 
cal rrnB operon of H. marismortui (which differs 
from the canonical rrn operon shown in Fig. 2A since 
it does not encode tRNAAla and it lacks a BHB motif
in the 16S rRNA-processing stem) demonstrates that 
there is no precursor cleavage in the 5'-ETS preceding
the 16S sequence and that excision occurs by endonuclease
cleavage at the 5’ end of mature 16S
rRNA. To summarize, in vivo analysis of rRNA-processing
intermediates indicates that when BHB motifs
are present, they are efficiently used for the excision
of pre-16S or pre-23S rRNA from the rRNA
operon primary transcript.

Figure 2. Structure of the rRNA operons and processing of the ribosomal RNA precursor. (A) The structure of a typical rRNA
operon from Euryarchaeota is shown. 16S, 23S, 5S rRNA, tRNAAla, and tRNACys genes are represented by solid black boxes;
inverted repeats flanking the 16S and 23S genes are indicated by hatched boxes. Sequences are not drawn to scale. (B) The
rRNA operon is transcribed as a multicistronic precursor molecule and is cleaved at numerous sites by a variety of endonucleases
(indicated by black arrows and scissors). The splicing endonuclease (SE) cleaves at the BHB motifs to excise the precursor
16S and precursor 23S rRNA sequences, and RNase P and tRNase Z remove tRNAAla from the internal transcribed
spacer region (47). The activities responsible for maturation at the 5’ and 3’ ends of the 16S and 23S rRNAs have not been
identified or characterized. There is some evidence to suggest that the 5S sequence is excised from the primary transcript as a
precursor (indicated by black arrows upstream and downstream from the 5S rRNA) and trimmed by a few nucleotides at both
the 5’ and 3’ ends to generate the mature 5S sequence (47). The cleavage site in the anticodon loop of the tRNAAla (indicated
by a black arrow and 3’) was mapped as a 3’-end site and there was no corresponding 5’-end site at the same position (47).
Thus this particular 3’ end is either generated by exonucleolytic trimming from the 3’ end of the tRNA, or it is generated by
an endonucleolytic cleavage in the anticodon loop and the other product containing the tRNA 3’ half is degraded. The processing
sites for the tRNACys have not been mapped experimentally, but tRNACys 5’ and 3’ processing is very likely to be performed
by RNase P and tRNase Z, respectively. The 5’-ETS is located upstream of the 16S rRNA, ITS1 is located between
the 16S and the 23S rRNA, ITS2 is located between the 23S and the 5S rRNA, and the 3’-ETS is downstream of the 5S rRNA.


Cleavage of the BHB-processing sites within pre- 
rRNA seems to be identical to the cleavage of ar- 
chaeal introns. Recent studies employing cloning of 
small noncoding RNAs and PCR reactions across 
cleavage junctions suggest that the BHB cleavage 
products can be ligated to each other, yielding circu- 
larized pre-16S or pre-23S rRNAs and a ligated
leader-spacer-trailer molecule (135). These spacer se-
quences often contain an RNA kink turn (K-turn) 
structural motif, which introduces a sharp turn in the 
backbone structure of the RNA. The aL7Ae protein 
binds to these K-turn motifs that are formed by box 
C- and box D-like sequences (45, 135, 157). These 
box sequences were first recognized as conserved fea- 
tures of eucaryal and archaeal C/D box small RNAs 
that guide 2'-O-ribose methylation in rRNA. The cir- 
cular pre-rRNA intermediates are rapidly processed 
to yield the mature rRNAs. 

Introns are sometimes found in archaeal rRNAs: 
in the 16S rRNA gene of P. aerophilum (23), the 23S 
rRNA genes of D. mobilis (77), and the 23S rRNA 
gene of Staphylothermus marinus (79). Due to the 
presence of the BHB motif that forms between the 
exon/intron and intron/exon junctions, it is expected 
that the ribosomal RNA introns are removed by the 
same splicing endonuclease that removes introns from 
tRNAs and that has been implicated in the excision 
of pre-16S and pre-23S from the precursor rRNA 
transcript (see “Maturation of transfer RNA ends,” 
above). The large free introns derived from pre- 
rRNAs have been observed as stable and abundant 
circular RNAs in certain crenarchaeota. In addition,
tRNA introns have also been found as circular mole- 
cules (91, 122). 


MODIFICATION OF tRNA AND rRNA 
NUCLEOSIDES 


Almost one hundred different modified nucleo- 
sides have been characterized in rRNA, tRNA, mRNA, 
and other RNAs (118), and virtually all of these are 
introduced after transcription as modifications of the 
normal adenosine (A), guanosine (G), cytidine (C), 
and uridine (U) residues. The proportion of modified 
nucleotides in tRNA can approach 50% or more (re- 
viewed in reference 14). Modifications in RNA offer 
an important mechanism for stabilizing the structure 
of the RNA across the entire temperature range of 
natural habitats for microorganisms (85, 99). Some 
RNA modifications are present in all three domains, 
suggesting that these modifications were present in a 
progenitor, whereas others are specific for each do- 
main, suggesting that these modifications have evolved 
after the three domains of life diverged. The two most
frequent modifications in RNA are ribose methylation
and pseudouridylation. In Archaea and Eucarya,
these modifications are sometimes mediated by
ribonucleoprotein (RNP) complexes that use base
complementarity to direct modification to specific
locations in the target RNA. The box C/D RNPs guide 2’-O-
ribose methylation (76, 103) and box H/ACA RNPs
guide the isomerization of uridine to pseudouridine
(9, 59, 134).


Box C/D Methylation Guide RNPs 


There are 67 sites of 2’-O-ribose methylation in 
the rRNA of S. acidocaldarius (94). This number is 
roughly equivalent to the number found in eucarya, 
such as yeast or human, and an order of magnitude 
greater than the number found in bacteria such as
E. coli. Both Archaea and Eucarya use box C/D RNPs 
to guide the majority of these methylations. In addi- 
tion, archaea also use the RNP guide complexes 
to guide methylation to many positions in tRNAs, 
whereas in eucarya, all of these modifications are 
believed to be mediated by proteins that recognize se- 
quence and structural features of the RNA substrate, 
without a guide RNA. The archaeal RNP complex 
consists of the C/D box sRNA guide and three pro- 
teins: the core RNA-binding protein aL7Ae, the aNop5 
(also referred to as aNop56), and the methyltrans- 
ferase, aFib. The aL7Ae binds directly to box C/D 
RNAs at the K-turn motifs (81) formed by conserved 
box C and D, and box C’ and D’ sequences (32, 86, 
116, 151), and nucleates the further addition of two 
copies of the aNop5 and aFib proteins. The aNop5
protein recognizes features in the RNA K turn that 
are stabilized by the binding of the aL7Ae protein and 
serves as a bridge between the sRNA and the catalytic 
aFib subunit. The aFib protein component of C/D 
box RNPs has an S-adenosylmethionine-binding do- 
main and has been implicated in the methyltrans- 
ferase activity of the complex (1, 45, 104, 116, 140). 
Base pairing of the guide RNA with the target RNA 
positions the substrate nucleotide for 2'-O-methyla- 
tion by fibrillarin (26, 76). The methylation guide 
RNP complexes from Sulfolobus, M. jannaschii, and 
A. fulgidus have been reconstituted from purified re- 
combinant components and are active in in vitro site- 
specific methylation (104, 116, 140, 158). 
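The guide mechanism described above can be illustrated with a short sketch. It rests on two assumptions that are not stated in this chapter: that guide and target pair by perfect Watson-Crick complementarity, and that the methylated nucleotide is the one paired with the fifth guide nucleotide upstream of box D or D' (the commonly cited "box D + 5" rule). The function names and both sequences are invented for illustration.

```python
# Hypothetical sketch: locate the nucleotide that a box C/D guide would
# target, assuming perfect Watson-Crick pairing and the "box D + 5" rule.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def predicted_methylation_site(guide: str, target: str, offset: int = 5):
    """Return the 0-based index in `target` of the predicted methylation site.

    `guide` is the antisense element written 5'->3', ending at the nucleotide
    immediately upstream of box D (or D'). Returns None when the target has
    no perfectly complementary stretch.
    """
    if not 1 <= offset <= len(guide):
        raise ValueError("offset must lie within the guide")
    paired_region = reverse_complement(guide)  # target stretch, 5'->3'
    start = target.find(paired_region)
    if start == -1:
        return None
    # The guide nucleotide `offset` positions upstream of box D pairs with
    # the target nucleotide at position `offset - 1` of the paired stretch.
    return start + offset - 1


toy_guide = "GCAUGGCUAC"         # invented sequence, not a real sRNA
toy_target = "AAAGUAGCCAUGCUUU"  # invented sequence, not a real rRNA
site = predicted_methylation_site(toy_guide, toy_target)
print(site, toy_target[site])
```

A real search would also have to tolerate G-U pairs and mismatches, which this sketch deliberately ignores.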

In one unusual instance a C/D box methylation 
guide sequence was identified in the intron of the 
tRNATrp gene from several euryarchaeal species (36).
Subsequently it was shown that the guide sequence
within the intron was used for methylation of C34
and U39 of tRNATrp from H. volcanii, and that
methylation at both positions precedes removal of the 
intron (18, 36, 130). 

Only a single protein 2'-O-methyltransferase, 
which catalyzes the archaeal-specific methylation of
tRNA C56 to Cm56, has been identified (118). It is 
present in all archaeal genome sequences, except for 
the crenarchaeon, P. aerophilum, which employs an 
sRNP to modify C56 (118).


Box H/ACA Pseudouridylation Guide RNPs


In Eucarya there are dozens of pseudouridine 
modifications in rRNA; H/ACA guide RNPs mediate 
their incorporation. In contrast, pseudouridine mod- 
ifications in rRNA are rare in Bacteria and Archaea. 
Bacteria use proteins to introduce these modifica- 
tions, whereas in at least some instances, Archaea use 
the guide H/ACA RNP machines. Four protein com- 
ponents of the archaeal H/ACA RNPs have been 
identified: aCbf5, aGar1, aL7Ae, and aNop10 (19, 
51, 68, 89, 121, 148, 150). Active H/ACA RNPs have 
been reconstituted from recombinant components 
from P. furiosus and Pyrococcus abyssi (8, 31). The 
evidence indicates that aCbf5 is the pseudouridine 
synthase which catalyzes the modification (8, 148). 
The aL7Ae protein is present to stabilize a K-turn mo- 
tif; this stabilized kink is required for optimal activ- 
ity of the complex. The other two components are be- 
lieved to be involved in binding and release of the 
substrate RNA target and are required for optimal ac- 
tivity (8, 31). 

Pseudouridine is a common modification in 
other types of RNAs; most or all of these modifications
appear to be introduced by pseudouridine synthase
enzymes that recognize localized sequence and
structure in the substrate RNA and do not involve 
guide RNPs (30, 82, 101, 102, 154). 


The aL7Ae Binds to Many Small Noncoding RNAs 


The aL7Ae protein is a component of both C/D 
box and H/ACA RNPs. The protein plays a critical 
role early in the assembly pathway of the box C/D
RNP complex (104), but not in box H/ACA RNP
complexes (8). The aL7Ae protein recognizes the 
K-turn motif that is found in many different RNAs from 
all three domains of life (45, 81, 86). In the Archaea, 
K-turn motifs have been shown to occur in rRNA, 
C/D box sRNAs, and H/ACA sno-like RNAs (121), 
and it has been demonstrated that aL7Ae specifically 
interacts with the K-turn motif in all three of these 
RNA classes (10, 86, 121). By employing immuno- 
affinity precipitation with antibodies against aL7Ae, 
Omer and colleagues showed that there is a large and 
diverse class of RNAs in S. solfataricus that interact 
with this protein (157). In addition to the C/D box 
and H/ACA box guide RNAs, the following small 
RNA fragments were recovered from a library screen: 
(i) sense strand mRNA sequences originating from 
within or overlapping with ORFs, (ii) RNAs from 
intergenic regions, (iii) antisense RNAs from within 
or overlapping with sense strand ORFs or C/D box 
sRNAs, (iv) internal regions of 7S RNA, and (v) fragments
of rRNAs and tRNAs. Hüttenhofer and colleagues
(134) have constructed a similar library of
nonselected, small, noncoding RNAs from S. solfa- 
taricus. Many of the clones from this library overlap 
with the Omer library (157), and many contain C- and 
D-box sequences and are able to bind the aL7Ae pro- 
tein. The antisense noncoding RNAs (ncRNAs) from 
these complementary libraries may be important 
posttranscriptional regulators of gene expression, al- 
though the mechanism of action has not been inves- 
tigated (157). 


PROCESSING OF mRNA 


The processing of mRNA in Bacteria and Eu- 
carya is highly ordered and contributes to the regula- 
tion of gene expression. Half-lives can vary consider- 
ably among individual mRNA species. The stabilities 
of several bacterial transcripts depend on external 
factors such as temperature or oxygen tension (re- 
viewed in reference 133), whereas the stabilities of eu- 
caryal transcripts vary in response to cellular stimuli 
and stage of differentiation (reviewed in reference 
120). In bacteria the half-lives of individual segments 
of polycistronic transcripts can vary considerably and 
contribute to the nonstoichiometric production of 
operon-encoded proteins (61, 117, 131). 

The organizational structures of bacterial and eucaryal
mRNAs differ significantly (Fig. 3). Eucaryal
mRNAs are mostly monocistronic and may contain 
introns that disrupt coding regions and have to be re- 
moved by splicing. The 5’ end of the transcript is 
modified by the addition of a methylated guanosine 
cap, and the 3’ end contains a long nontemplated 
poly(A) tail that stabilizes the transcript. In contrast, 
bacterial mRNAs are often polycistronic, have a 
triphosphate at the 5’ end, and occasionally contain 
a short nontemplated poly(A) tail at the 3’ end. The 
5'-triphosphate provides protection against the en- 
doribonuclease RNase E, whereas processing prod- 
ucts with 5'-monophosphates are attacked by RNase 
E unless hidden within a region of stable secondary 
structure (92). The short poly(A) tail at the 3’ end 
of bacterial mRNAs contributes to degradation by 
providing a platform for the loading of exoribonu- 
cleases that can then overcome stable secondary 
structures that normally protect the RNA from degra- 
dation (87). 

Similar to bacterial mRNAs, archaeal mRNAs 
are often polycistronic, and no specialized cap has 
been found at the 5’ end (21, 22). The mechanisms 
of mRNA decay in the Archaea are presently poorly 
understood, but the existing data suggest that Ar- 
chaea combine bacterial and eucaryal features of 
mRNA processing and degradation, and these mechanisms
differ significantly among the members within
the domain.

Figure 3. Principles of mRNA decay in Bacteria and Eucarya and model for mRNA decay by exosomes in Archaea. The addition
of poly(A) destabilizes transcripts in Bacteria, and in the nucleus of Eucarya. Exonucleolytic activity of the exosome has
been demonstrated, but the involvement of endonucleases in mRNA turnover is still uncertain. Differential stabilities of mRNA
segments can control protein levels in Bacteria and Archaea. Differently shaded bars indicate different ORFs in Bacteria and
Archaea, and introns and exons in Eucarya. Short black bars indicate endonucleolytic cleavage sites in bacterial mRNA. Endoribonucleases
are symbolized by scissors, exoribonucleases by “pacmen,” the exosome by multiple “pacmen.” The black
hairpin structures represent stabilizing mRNA secondary structures.


Stability of mRNA 


In general, the half-lives of bacterial mRNAs are 
much shorter than those of eucaryal mRNAs. Recent 
studies have used microarray technology to determine 
the respective half-lives of all expressed mRNAs in 
E. coli (12, 128), B. subtilis (65), and Saccharomyces 
cerevisiae (147). In all these organisms a wide range 
of stabilities was found for individual mRNAs. More 
than 80% of the E. coli and B. subtilis mRNAs ex- 
hibited half-lives in the range of 3 to 8 min, whereas 
in S. cerevisiae a mean half-life of 23 min was calcu- 
lated (147). 

Little is known about the stability of mRNAs in 
Archaea. In an early study, the growth of M. vannielii 
was inhibited by addition of bromoethanesulfonate 
or by the removal of hydrogen, and the decay of sev- 
eral mRNA species was quantified by Northern blot 
analysis (67). Under these conditions, the half-lives of 
the secY, mcr, mva, and argG transcripts varied from 
7 to 57 min. 

Two more recent studies have used actinomycin
D to block transcription in S. solfataricus or Haloferax
mediterranei (13, 73). In S. solfataricus, the fol-
lowing mRNA half-lives were determined by North- 
ern blot analysis: tfb1 (transcription factor TFIIB par- 
alog), 2 h; sod (superoxide dismutase), 2 h; dgh1 
(glucose dehydrogenase), 37 min; malA (α-glucosidase),
25 min; tfb2 (transcription factor TFIIB), 13.5
min; gln1 (one of three glutamine synthetases), 6.3 min 
(13). These data suggest that some archaeal mRNAs 
have significantly longer half-lives than most
mRNAs from E. coli and B. subtilis. The sequences or 
structural features that contribute to these apparent 
differences in half-life and the pathway for degrada- 
tion have not been identified. 
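Half-lives such as those listed above are typically derived from a time course of transcript abundance after transcription is blocked. The following is a minimal sketch of that calculation, assuming simple first-order decay and background-corrected signal intensities. It is not necessarily the procedure used in the cited studies, and the function name and the synthetic data are hypothetical.

```python
import math


def estimate_half_life(times_min, signals):
    """Estimate an mRNA half-life (in minutes) from a decay time course.

    Fits ln(signal) = ln(S0) - k * t by ordinary least squares and returns
    ln(2) / k. Assumes first-order decay and strictly positive signals.
    """
    if len(times_min) != len(signals) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(s) for s in signals]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for a decaying transcript
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return -math.log(2) / slope


# Synthetic, noise-free time course generated with a 13.5-min half-life.
times = [0, 5, 10, 20, 40]
signals = [100 * 0.5 ** (t / 13.5) for t in times]
print(round(estimate_half_life(times, signals), 1))
```

With real blot data the fit should be restricted to time points taken after the inhibitor has taken full effect, a caveat that matters for the actinomycin D experiments discussed in this section.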

In H. mediterranei, the divergently transcribed 
gvpACNO and gvpDEFGHIJKLM operons encode
proteins required for the formation of gas vesicles, the intracellular
structures that are used to regulate the buoyancy
of cells in highly saline environments. All of the
full-length transcripts and detectable transcript fragments
derived from the two operons use either the
gvpA or gvpD promoters, and all share the common
5'-promoter proximal sequences. Full-length operon 
transcripts are rare, and shorter transcripts with 3’ ends 
more proximal in the operons are always more abun- 
dant than longer transcripts with 3’ ends located more 
distal in the operons (119). It is unclear if the mRNA
fragments from the two operons result from premature
transcription termination within the operons
or from endonuclease cleavage and differential stabil- 
ity (as shown for a bacterial polycistronic transcript 
in Fig. 3). In an attempt to distinguish between these 
two alternatives, actinomycin D was used to inhibit 
transcription and Northern hybridization was used to 
follow the decay of the various mRNA fragments. 
The half-lives of the gvp transcript fragments varied 
from 4 min to 80 min under different conditions of 
growth. There was a strong correlation between frag- 
ment length and stability: the shorter the fragment, 
the longer the half-life. This correlation was used to 
argue that the transcript fragments are generated by 
endonucleolytic cleavage rather than premature transcription
termination, and that the distal sequences
released following cleavage are selectively degraded.
This results in a directional 3’ to 5’ degradation of the
mRNAs. The most stable and most abundant transcript
fragment was the gvpA mRNA encoding the
major gas vesicle structural protein. Overall, this study 
suggests that selective transcript stability contributes 
to the regulation of gas vesicle gene expression. At the 
present time there is no indication of what the signals 
for endonuclease cleavage within the mRNA transcripts 
might be and how the proximal fragment might be 
stabilized and the distal fragment destabilized. 

In both the S. solfataricus and the H. mediter- 
ranei studies, the interpretation of the data is com- 
plicated by the use of actinomycin D as an inhibitor 
of transcription. Actinomycin D inhibits transcription 
and DNA replication fork progression by intercalating
into DNA. If intercalation occurs at random,
longer transcripts are more susceptible to
inhibition than shorter transcripts. Moreover, uridine
incorporation was used to monitor the efficacy of in- 
hibition. It is well known from studies in bacteria that 
uridine incorporation is efficient only when rRNA 
synthesis is high and there is a net accumulation of 
RNA in the culture. Since rRNAs are derived from 
the transcription of very large operons (~5,000 nt), 
rRNA synthesis is extremely sensitive to inhibition by 
actinomycin D. Indeed, there may be synthesis of 
shorter promoter proximal mRNA fragments even in 
the absence of detectable uridine incorporation. At 
this point the data should be taken as suggestive; a 
more comprehensive understanding will require new 
technological advances. 
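The length dependence described above can be made concrete with a toy calculation. The sketch below assumes that each template nucleotide independently carries a transcription-blocking intercalation with the same small probability. This is a simplification, and both the function name and the per-nucleotide probability are hypothetical rather than measured values.

```python
def fraction_blocked(length_nt: int, p_per_nt: float) -> float:
    """Probability that at least one blocking event lies within a template.

    Assumes each of the `length_nt` positions is blocked independently with
    probability `p_per_nt`, so the transcript escapes only if every position
    is free: P(blocked) = 1 - (1 - p) ** L.
    """
    if length_nt < 0 or not 0.0 <= p_per_nt <= 1.0:
        raise ValueError("length must be >= 0 and probability within [0, 1]")
    return 1.0 - (1.0 - p_per_nt) ** length_nt


p = 0.001  # hypothetical blocking probability per nucleotide
for name, length in [("short proximal fragment", 500), ("rRNA operon", 5000)]:
    print(f"{name:>24}: {fraction_blocked(length, p):.3f}")
```

Under this model a long rRNA operon transcript is blocked almost completely at a drug concentration that still lets a substantial fraction of short, promoter-proximal transcripts through, which is the concern raised in the text.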


Bacterial Type mRNA-Degrading Enzymes 


In E. coli, and probably other gram-negative 
bacteria, the endoribonuclease RNase E plays a very 
important role in mRNA turnover by catalyzing the 
initial endonuclease cleavage of mRNA molecules
(87). An RNase E-like RNA-degrading activity has
been reported in halophilic Archaea (56), but the corresponding
protein has not been identified. Archaeal
genomes encode proteins with very limited sequence 
similarity to the bacterial endoribonuclease RNase E. 
One of these proteins, FAU-1 from P. furiosus, binds 
to an AU-rich RNA sequence (75); whether FAU-1 
or other RNase E homologs exhibit RNase activity 
remains to be determined. Some archaeal genomes 
encode RNase R homologs. RNase R is a 3’ to 5’ ex- 
oribonuclease that is involved in bacterial rRNA and 
tmRNA (an RNA that functions as both mRNA and 
tRNA during the process of translation) processing 
and mRNA turnover (e.g., 33, 34, 69). The role of 
RNase R in mRNA decay in members of the Archaea 
is currently under investigation. 


An Exosome-like RNA-Degrading 
Complex in Archaea 


In Bacteria and Eucarya, large multiprotein com- 
plexes are involved in mRNA turnover. The degrado- 
some consists of endoribonuclease, exoribonuclease, 
and helicase activities and has been characterized 
in E. coli (25), Rhodobacter capsulatus (74), and a 
psychrophilic strain of Pseudomonas syringae (112) 
(Fig. 4). The central component of the complex is the 
endoribonuclease RNase E, which is responsible for 
the cleavage that initiates the degradation of most 
mRNA molecules. In E. coli, PNPase (polynucleotide
phosphorylase), a major 3'- to 5'-exoribonuclease, is
associated with the degradosome, whereas in P. syringae,
RNase E interacts with the hydrolytic 3’- to
5'-exoribonuclease RNase R. These activities act 
on the natural mRNA 3’ end or the ends generated by 
endonuclease cleavage by RNase E. The helicase 
component presumably functions to unwind mRNA 
secondary structure and allow processive 3'-exonuclease 
degradation. However, these RNA-degrading complexes have
been identified only in gram-negative bacteria.
Additional enzyme components found in degrado- 
some complexes are enolase and polyphosphate ki- 
nase in E. coli (16, 25), and the transcription termi- 
nation factor, Rho, in R. capsulatus (74). 

The eucaryal exosome has been implicated in 
rRNA processing and mRNA degradation (2, 3, 114). 
The eucaryal exosome consists of ten essential pro- 
teins known or predicted to possess 3'- to 5'-exonu- 
clease activity (43, 97, 142) (Fig. 4). Six of these pro- 
teins (Rrp41 to 43, Rrp45 to 46, and Mtr3) contain 
an RNase PH domain (RPD) typical for phospho- 
rolytic exoribonucleases. The hydrolytic exoribonu- 
cleases Rrp4 and Rrp40 are similar to each other and 
to another exosome component, the RNA-binding 
protein Csl4. The hydrolytic exoribonuclease Rrp44
resembles the bacterial RNase R. This core exosome
interacts with additional proteins, including helicases,
in both its cytoplasmic and nuclear forms. The bacterial
PNPase is a trimer of subunits, each containing two
RPDs that form a hexameric ring surrounding
a central pore (132). It was proposed that the six
RPDs of the eucaryal exosome are arranged
in a similar hexameric structure (4, 113).

Figure 4. RNA-degrading protein complexes in members of the three domains of life. The degradosome of gram-negative
bacteria is organized around the endoribonuclease RNase E. Association of RNase E with an exoribonuclease and one or more helicases
seems to be conserved. The exoribonuclease PNPase is a trimer, each subunit of which contains two RNase PH domains, one
KH- and one S1 RNA-binding domain. A similar hexameric structure was found for the central ring of the exosome in Archaea
and Eucarya, which is composed of six subunits with an RNase PH domain. The exosome subunits Rrp4 and Rrp40 show the typical
KH- and S1 RNA-binding domains of hydrolytic exoribonucleases. Rrp44, Csl4, and RNase R contain the S1 RNA-binding
domain. In the Sulfolobus exosome core, Rrp41 is catalytically active while Rrp42 is not but contributes to the structuring
of the Rrp41 active site.

An analysis of archaeal genomes revealed the 
presence of orthologs of exosome proteins in all sequenced
species, with the exception of halophiles and
M. jannaschii (83). The existence of a eucaryal-like ex- 
osome was demonstrated for S. solfataricus (53). Four 
of the Sulfolobus exosome proteins have counterparts 
in the eucaryal exosome (Rrp4, Rrp41, Rrp42, Csl4). 
An additional protein with sequence similarity to the 
bacterial DnaG primase was identified as a subunit of 
the Sulfolobus exosome. Based on the subunit compo- 
sition, a hexameric ring of three Rrp41 and three 
Rrp42 proteins was suggested. This conformation
of the S. solfataricus complex was recently confirmed
by reconstitution experiments (146a) and by solving
the crystal structure at 2.8-Å resolution; the complex is
active in the degradation of mRNA (90). 

It has not been determined whether the archaeal 
exosome has only exoribonucleolytic activity or 
whether it also serves to organize proteins with endo- 
nucleolytic activity in vivo (Fig. 4). Structure-guided 
mutagenesis studies revealed that the activity of the 
complex resides within the active sites of the Rrp41 
subunits, all three of which face the same side of 
the hexameric structure. The Rrp42 subunit is inac- 
tive but contributes to the structuring of the Rrp41 
active site (90). 


Polyadenylation of Archaeal mRNAs 


Most eucaryal transcripts carry poly(A) tails of 
~200 nt. The decay of eucaryal polyadenylated mRNA 
is initiated by a shortening of the poly(A) to about 
15 nt and is followed by cap removal and 5'-3'-exo- 
nucleolytic decay (11, 72). More than 20% of E. coli 
mRNAs carry poly(A) tails of 10 to 40 nt; this feature 
contributes to the destabilization of these mRNAs (15, 
100, 124). Poly(A) tails are also added to mRNA frag- 
ments that result from RNase E cleavage of full-length 
mRNAs (66). There is evidence that PNPase and 
RNase E are involved in poly(A) removal (71, 146). 
RNase E-catalyzed poly(A) removal is an endonucle- 
olytic process regulated by the phosphorylation status 
at the 5’ end of the mRNA transcript. The rate of 
poly(A) removal by RNase E is much higher when the 
5' end carries a monophosphate group, compared with
substrates with triphosphate at the 5’ end (146). 

Some archaeal mRNAs were reported to carry 
short poly(A) tails (21, 145). The role of these tails 
in mRNA decay has not been studied. The eucaryal 
poly(A) polymerases are related to the archaeal CCA-
adding enzymes (7), suggesting that these archaeal enzymes
may have a second function as poly(A)
polymerases. More recent analyses have demonstrated 
the presence of poly(A) tails up to 30 nt in length in 
S. solfataricus mRNAs, but no poly(A) tails have been 
detected in H. volcanii mRNAs (110). The same 
study revealed that the exosome-like complex of 
S. solfataricus constitutes the major polyadenylating 
activity in this organism. Polyadenylation is catalyzed 
by the same Rrp41 active sites that are used for degra- 
dation (90). Ongoing studies are addressing whether 
polyadenylation exists in other branches of the Ar- 
chaea, and which enzymes catalyze polyadenylation 
and degradation. 

Recent studies revealed that polyadenylation of
transcripts by the TRAMP complex in the eucaryal nucleus
stimulates degradation by the exosome (88, 143).
This suggests that the ancestral role of poly(A) addi-
tion was stimulation of exonucleolytic degradation, 
and this function is maintained in modern Bacteria 
and Archaea. 


PERSPECTIVE: THE NEXT FIVE YEARS 


In Archaea (as well as Bacteria and Eucarya) our 
understanding of the processing, modification, and 
maturation of the major RNA components of the 
translation machinery (i.e., tRNA and rRNA) is, in
many ways, still rudimentary. Nevertheless, the infor- 
mation that is known for rRNA and tRNA is serving 
as a foundation for understanding the processing, 
maturation, and degradation of other, less well char- 
acterized classes of RNA. In recent years, there has 
been a revolutionary realization that organisms con- 
tain a very large number of small ncRNAs that play 
an essential role in the regulation of gene expression 
at various levels. Many of these ncRNA molecules are 
likely to be synthesized as precursors and subjected to 
processing and maturation. While still poorly under- 
stood, recent evidence suggests that they may control 
the function, stability, and turnover, including RNA- 
quality control, of stable RNAs and mRNAs. There is 
a high likelihood that many of these issues will be re- 
solved in the near future and that many of these 
processes will be defined within the context of the 
three-dimensional structure of the cell. It will be interesting
to see whether Archaea contain subcellular
structures that are the progenitors of eucaryal RNA-
processing and assembly structures, such as the nucleolus,
Cajal bodies, and P bodies.


Chapter 8


Translation 


PAOLA LONDEI 


INTRODUCTION 


Translation is the last step of the gene expression 
pathway, whereby the genetic message carried from 
the DNA in the form of messenger RNA (mRNA) is 
converted into the final product: a polypeptide chain. 
The synthesis of a protein is an extraordinarily com- 
plex process, which requires the participation of a 
host of small and large molecules that are centered 
around what is perhaps the cell’s most wonderful 
macromolecular machine, the ribosome. The impor- 
tance of the translational apparatus in the cellular 
economy is perhaps best underscored by the fact that 
up to one-fourth of the overall metabolic energy in an 
actively growing cell is devoted to the synthesis of 
translational components. 

Being a fundamental cellular process, translation 
is well conserved across the three domains of life. It 
is now generally accepted that the translation appa- 
ratus must have already attained a fair degree of com- 
plexity in the Last Universal Common Ancestor 
(LUCA) of extant cells. The LUCA was likely to have 
been endowed with ribosomes that are very similar 
in structure and function to present-day ribosomes, in 
addition to having tRNAs, aminoacyl-tRNA syn- 
thetases, and some translation factors that are also 
similar to those of extant cells (4, 18, 57, 61, 81). 
However, there are also important differences in com- 
position and complexity of the protein synthesis ma- 
chinery of present-day cells in the three domains of 
life. Until recently, conventional wisdom stated that 
there were two different versions of the translation 
apparatus: a simple, streamlined form present in the 
Bacteria and a more complex form found in Eucarya. 
This dichotomy seemed to be the logical result of the 
different organization and lifestyles of “eukaryotic” 
and “prokaryotic” cells (see Chapter 1). According to 
this logic, the “simpler” bacteria, whose basic evolutionary
strategy consisted of maximizing the velocity
of growth and multiplication, had gene expression 
machinery made of fewer and often smaller compo- 
nents. In contrast, the complexity of Eucarya (e.g., 
metazoans) naturally led to the expectation that they 
would have a sophisticated and complicated gene ex- 
pression apparatus. 

The discovery of the third domain of life, the Ar- 
chaea, has challenged in many ways the classical text- 
book dichotomy between “prokaryotes” and “eu- 
karyotes” (119). As far as cellular organization is 
concerned, all known archaea are unicellular pro- 
karyotes. However, a host of phylogenetic, molecular, 
biochemical, and genomic studies have now shown 
that the Archaea are clearly distinct from the other 
prokaryotic domain, the Bacteria, and are instead
specifically related to the Eucarya. The affinity be- 
tween the Archaea and the Eucarya is particularly ev- 
ident in the structure of the translation apparatus, to 
the extent that, far from being “simple prokaryotes,” 
the Archaea show an unexpected complexity in some 
aspects of translation, and this is currently one of the 
more puzzling and challenging topics in need of experimental
investigation (11, 30). Consequently, the
study of the translational machinery in archaea is not 
only interesting in its own right but also provides op- 
portunities for gaining new insight into the evolu- 
tionary history of the protein synthesis mechanism. 

The importance of translation to cellular function
cannot be overstated. The emergence of translation
as a process was key to the evolution of modern
cellular life (118). Primitive “life” based on
self-replicating nucleic acids without translation is 
conceivable. For example, the hypothesis of an 
“RNA world” posits the existence of RNA-based an- 
cestral entities, entirely without a proteome (20, 47, 
68, 88). However, successful self-propagating cells 
were created only with the emergence of translation,
which established a stable link between a nucleic
acid-based genotype and a protein-based phenotype. 
It is very unlikely that we will ever completely un- 
derstand how translation came into existence. How- 
ever, the discovery of the Archaea has rekindled in- 
terest in the study of early cellular evolution and has 
made it possible to tackle the problem of the evolu- 
tion of translation from a novel vantage point (118). 
This chapter describes what is known about the
translational apparatus and the protein-synthesis
mechanism in archaea. It also highlights how this
knowledge has advanced, and will continue to advance,
our understanding of the evolution of translation
in all forms of life.


COMPOSITION AND STRUCTURE OF THE 
ARCHAEAL TRANSLATION APPARATUS 


General Overview 


The composition of the translation apparatus is 
basically the same in all cells and cell organelles. The 
ribosomes are large ribonucleoprotein complexes 
composed of two unequal subunits. In archaea, the 
small ribosomal subunits sediment at about 30S and 
contain a single RNA molecule of about 1,500 nt 
(small-subunit [SSU] rRNA) and 25 to 28 proteins, 
depending on the organism. The large ribosomal sub- 
units sediment at about 50S and contain a large RNA 
molecule of about 3,000 nt (large-subunit [LSU] 
rRNA), a smaller one of about 120 nt (5S RNA), and 
35 to 40 proteins, depending on the organism. 

Other essential components of the protein syn- 
thesis machinery that are found in all cells are spe- 
cific sets of proteins known as translation factors. 
These are necessary to assist the different stages of 
translation, i.e., initiation, elongation, and termina- 
tion. To date, computational analyses of about 20 
complete archaeal genome sequences have identi- 
fied six putative initiation factors (IFs), two elon- 
gation factors (EFs), and one release or termination 
factor (RF). It is possible, however, that other fac- 
tors will be identified in the future through the 
progress of biochemical and genetic studies of ar- 
chaeal translation. 

Comparative genomic analyses have also re- 
vealed that the three domains of life possess a com- 
plex set of tRNAs and aminoacyl-tRNA synthetases. 
Archaeal tRNAs and their charging enzymes are 
covered in Chapters 7 and 9. Other proteins that 
participate in the synthesis of the translation appa- 
ratus or in the posttranslational processing of pro- 
teins are described in Chapter 7 (rRNA processing 
and modifying enzymes) and Chapter 10 (chaper- 
ones and proteasomes). 


Archaeal Ribosomes 


Historical overview 


The basic architecture of the ribosome is evolu- 
tionarily conserved. Nevertheless, it has been known 
for many years that domain-specific differences exist 
between bacterial and eucaryal ribosomes. The latter 
are bigger and compositionally more complex than 
the former; their component rRNAs are larger by sev- 
eral hundred nucleotides, and they typically contain 
more and somewhat larger proteins. The structural 
differences between bacterial and eucaryal ribosomes 
are also underscored by their differential sensitivity to 
ribosome-targeted antibiotics. 

Biochemical and structural analyses of archaeal ri- 
bosomes started in the eighties, soon after the discov- 
ery of the third domain of life. Note that the discov- 
ery of the Archaea can be attributed to the field of 
ribosomal studies, since it stemmed from the recogni- 
tion that the sequences of archaeal SSU rRNAs formed 
a coherent cluster sharply separated from the equally 
coherent clusters formed by the bacterial and eucaryal 
SSU rRNA sequences (120) (see Chapters 1 and 2). 

The gross chemical composition of archaeal ri- 
bosomes from different species of Crenarchaeota and 
Euryarchaeota was initially analyzed by many tech- 
niques such as gel electrophoresis and density gradi- 
ent sedimentation (112). The general finding was that 
the ribosomes and ribosomal subunits of most ar- 
chaea had sedimentation coefficients of 30S, 50S, and 
70S and included 16S, 23S, and 5S rRNAs, similar to 
their bacterial counterparts. However, the ribosomes 
of certain archaea contain more proteins than bacte- 
rial ribosomes; this was the case, in particular, for the 
small subunit (75). A larger number of proteins was a 
characteristic of sulfur-dependent thermophiles, while 
halophiles and methanogens had ribosomes more sim- 
ilar in composition to their bacterial counterparts (3). 
Observations made by electron microscopy also showed 
that the ribosomes of sulfur-dependent thermophiles 
had morphological characteristics similar to those of 
eucaryal ribosomes (66-67). This was particularly the 
case for the small ribosomal subunits, which pos- 
sessed a “bill” on the head and “lobes” of the body, 
similar to those observed in eucaryal, but not bacter- 
ial, counterparts. The fact that these features were ab- 
sent in the ribosomes of halophiles and some 
methanogens led Lake and coworkers to propose that 
the sulfur-dependent archaea constituted a separate 
order (referred to as the “eocytes”) specifically related 
to eucarya (96). Although the “eocyte” hypothesis 
proved untenable, data accumulated since the begin- 
ning of the genome-sequencing era have essentially 
confirmed that archaeal ribosomes have a composition
that is more similar to eucaryal ribosomes, despite
the overall size of the archaeal ribosomes being
more similar to bacterial ribosomes.


Archaeal rRNAs 


A survey of the rRNAs present in 21 fully sequenced
archaeal genomes is shown in Table 1. The
sizes of archaeal SSU rRNAs range from about 1,430
to ~1,500 nt, while LSU rRNAs are ~2,900 to 3,100
nt in size. However, “extra-large” rRNAs are observed
in some archaea. The aberrantly large sizes are due
to the presence of introns, which are found, although
infrequently, in both the SSU and LSU rRNA genes.
For example, the 2,210-nt 16S rRNA gene of Pyrobaculum
aerophilum includes a 713-nt intron. Likewise,
the LSU rRNA gene of Aeropyrum pernix,
which measures 4,413 nt, is endowed with a 202- and
a 575-nt intron. A. pernix strain K1 also harbors an
intron in the SSU rRNA, making it the one species
that possesses introns in both the LSU and the SSU
rRNAs (84). Other archaeal genera with rRNA introns
are Desulfurococcus and Thermoproteus (45,
56). Intron splicing is carried out by an archaeal splicing
endonuclease that recognizes a characteristic secondary
structure, described as the helix-bulge-helix
motif (77).

Table 1. Size^a and number^b of genes for ribosomal RNAs in Archaea

Organism                      23S          16S          5S          5.8S
Crenarchaeota
    A. pernix                 4,413^c      1,423        119         167
                                                        132
    S. solfataricus           3,049        1,496        121
    S. tokodaii               3,012        1,445        125
    P. aerophilum             3,024        2,210^d      130
Euryarchaeota
  Archaeoglobales
    A. fulgidus               2,931        1,491        123
  Halobacteriales
    H. marismortui            2,921 (3)    1,471 (3)    121 (3)
    Halobacterium sp. NRC-1   2,905 (3)    1,472 (3)    122 (3)
  Methanobacteriales
    M. thermautotrophicus     3,028        1,478 (2)    126
                              3,034                     128
  Methanococcales
    M. jannaschii             2,889        1,474        119 (2)
                              2,948        1,477        119
    M. maripaludis            2,956 (3)    1,391 (3)    114 (4)
  Methanosarcinales
    M. mazei                  2,892 (3)    1,473 (3)    134 (2)
                                                        132 (1)
    M. acetivorans            2,831        1,429 (3)    134 (2)
                              2,848                     132 (1)
                              2,948
  Methanopyrales
    M. kandleri               3,097        1,511        132
  Thermococcales
    P. furiosus               3,048        1,446        125
                                                        121
    P. horikoshii             3,857        1,494        121 (2)
    P. abyssi                 3,017        1,502        121 (2)
    T. kodakaraensis          3,028        1,497        125 (2)
  Thermoplasmata
    T. acidophilum            3,044        1,470        122
    T. volcanium              2,906        1,469        122
    P. torridus               2,895        1,468        120
Nanoarchaeota
    N. equitans               2,861        1,344        122

^a Number of nucleotides in each rRNA gene.
^b Number of genes, if more than one, is indicated in parentheses, except when
   the genes are of different length, in which case their size is shown in full.
^c Gene containing two introns.
^d Gene containing one intron.

The small 5S rRNA is rather constant in size in 
all archaea, ranging from a minimal length of 119 nt 
to a maximum of 134 nt. Notably, A. pernix is unique
in apparently containing a 5.8S rRNA homolog of 167
nt in addition to two different 5S rRNA genes of 119
and 132 nt.

The primary and secondary structures of the 
rRNAs are extremely well conserved throughout evo- 
lution. The secondary structures of archaeal LSU and 
SSU rRNAs are almost superimposable on their bac- 
terial counterparts. Only very subtle differences are 
evident, and these consist of the occasional addition, 
lengthening, or shortening of hairpin elements of he- 
lical segments (Fig. 1). 


Ribosomal Proteins 


The availability of genome sequences has en- 
abled a detailed picture of the protein composition 
of archaeal ribosomes to be derived (69). Sixty-eight 
r-proteins (of a known total of 102) are represented in 
Archaea; 28 belong to the SSU and 40 to the LSU. 
Thirty-four (15 in the SSU and 19 in the LSU) are universal
proteins present in the ribosomes of all three
domains of life. Another 33 r-proteins (13 SSU and 
20 LSU) are shared by Archaea and Eucarya but not 
by Bacteria. No r-proteins are shared by Bacteria and 
Archaea but absent in Eucarya. Only one r-protein 
(LXa in the LSU) is unique to the Archaea (69). 

Figure 1. Comparison of 16S RNA secondary structure in archaea and bacteria. Secondary structure models are shown for one archaeal 16S rRNA (S. solfataricus) and one bacterial (E. coli) 16S RNA. The regions where the structures differ are indicated by arrows. The black lines represent identified tertiary interactions between nucleotides. Data taken from http://www.rna.icmb.utexas.edu/. The details of the database are described in reference 17.

The total number of r-proteins in Bacteria, Archaea,
and Eucarya is 57, 68, and 78, respectively.
The ribosomal protein complement of archaeal ribosomes
is almost entirely represented in the Eucarya, the
sole exception being LXa. The similarity between the
protein composition of archaeal
and eucaryal ribosomes is further illustrated by the
fact that, among the universal r-proteins, the archaeal
and eucaryal homologs have the highest level of sequence
identity. On the other hand, the substantial divergence
of the bacterial ribosome from the archaeal/eucaryal
type is highlighted by the fact that 23 r-proteins
are exclusive to bacteria (69).
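
The bookkeeping behind these counts can be checked with a few lines of Python. This is only an illustrative sketch: the category labels and function names are hypothetical, the numbers are those quoted in this section (69), and the count of eucarya-specific proteins is not stated in the text but is derived here from the quoted eucaryal total of 78.

```python
# Counts of r-protein families per distribution category, as quoted in the
# text (69). The keys are illustrative labels, not terminology from the source.
CATEGORY_COUNTS = {
    "universal": 34,        # present in all three domains (15 SSU + 19 LSU)
    "archaea_eucarya": 33,  # shared by Archaea and Eucarya only (13 SSU + 20 LSU)
    "archaea_only": 1,      # LXa (LSU)
    "bacteria_only": 23,    # exclusive to Bacteria
}

# Categories that contribute to the ribosome of each domain.
DOMAIN_CATEGORIES = {
    "Bacteria": ("universal", "bacteria_only"),
    "Archaea": ("universal", "archaea_eucarya", "archaea_only"),
}

EUCARYA_TOTAL = 78  # quoted total for Eucarya


def domain_total(domain: str) -> int:
    """Sum the category counts that make up one domain's r-protein set."""
    return sum(CATEGORY_COUNTS[c] for c in DOMAIN_CATEGORIES[domain])


def eucarya_only() -> int:
    """Derived (not quoted): eucaryal proteins absent from Archaea and Bacteria."""
    return (EUCARYA_TOTAL
            - CATEGORY_COUNTS["universal"]
            - CATEGORY_COUNTS["archaea_eucarya"])


def known_families() -> int:
    """All distinct r-protein families across the three domains."""
    return sum(CATEGORY_COUNTS.values()) + eucarya_only()


if __name__ == "__main__":
    print(domain_total("Bacteria"), domain_total("Archaea"), known_families())
```

Under these assumptions the category counts reproduce the quoted totals for Bacteria and Archaea and the overall figure of 102 known r-proteins.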

The protein composition of archaeal ribosomes 
is not the same in all archaea and reflects the phy- 
logeny of the host species (69). The branching order 
of several archaeal genera in the “universal tree of 
life” is shown in Fig. 2. On the whole, ribosomes of 
the early-branching Crenarchaeota (e.g., Sulfolobus) 
appear to have more ribosomal proteins than the 
Euryarchaeota (e.g., Thermoplasma). At one ex- 
treme, the crenarchaeon A. pernix is endowed with 
the full complement of 68 archaeal r-proteins, while 
the ribosomes of the late-branching halobacteria and 
thermoplasmales lack as many as 10 proteins, ap- 
proaching the number found in bacteria (Table 2). All 
of the missing proteins are those that are shared by 
the Archaea and Eucarya. Moreover, protein loss 
seems to have evolved gradually in the 50S subunit, 
while being stepwise in the 30S particle, the 30S hav- 
ing either 28 r-proteins in all Crenarchaeota or 25 in 
all Euryarchaeota (Table 2). 
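
The stepwise pattern in the 30S subunit and the gradual pattern in the 50S subunit can be tallied directly from Table 2. The sketch below is illustrative only: it transcribes a subset of the Table 2 entries (resolving "=" to the proteins listed above it), uses the table's totals of 28 SSU and 39 LSU proteins (L41 is excluded there), and the names `MISSING` and `subunit_counts` are hypothetical.

```python
# Totals used as the reference complement in Table 2 (L41 excluded).
ARCHAEAL_SSU_TOTAL = 28
ARCHAEAL_LSU_TOTAL = 39

# Subset of Table 2: species -> (missing SSU proteins, missing LSU proteins).
EURYARCHAEAL_SSU_LOSS = ("S25e", "S26e", "S30e")
MISSING = {
    "A. pernix": ((), ()),
    "S. solfataricus": ((), ("L35ae", "L38e")),
    "P. furiosus": (EURYARCHAEAL_SSU_LOSS, ("L13e", "L38e")),
    "M. mazei": (EURYARCHAEAL_SSU_LOSS,
                 ("L13e", "L14e", "L34e", "L35ae", "L38e")),
    "T. acidophilum": (EURYARCHAEAL_SSU_LOSS,
                       ("L13e", "L14e", "L30e", "L34e", "L35ae", "L38e", "LXa")),
}


def subunit_counts(species: str) -> tuple:
    """Return (SSU proteins present, LSU proteins present) for one species."""
    ssu_missing, lsu_missing = MISSING[species]
    return (ARCHAEAL_SSU_TOTAL - len(ssu_missing),
            ARCHAEAL_LSU_TOTAL - len(lsu_missing))


def total_missing(species: str) -> int:
    """Total number of r-proteins missing relative to the full complement."""
    ssu_missing, lsu_missing = MISSING[species]
    return len(ssu_missing) + len(lsu_missing)


if __name__ == "__main__":
    for name in MISSING:
        print(name, subunit_counts(name), total_missing(name))
```

In this subset the SSU count takes only the values 28 or 25, whereas the LSU count falls in several steps, with T. acidophilum missing 10 proteins in total.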

This progressive reduction of the number of ribo- 
somal proteins is not observed in bacterial and eucaryal 
ribosomes, whose composition remains essentially the 
same in most species, with only some exceptions in 
parasitic ones such as Giardia lamblia (69).

The fact that the archaeal species with “heavy”
ribosomes (larger numbers of r-proteins) are deep-branching
suggests that such ribosomes were also present in the
ancestor of the Archaea and Eucarya. The eucaryal
ribosomes may have gained more proteins in the course
of evolution, while the bacteria appeared to undergo 
a marked reduction in their complement of ribosomal 
proteins. It may be inferred that the 34 universal pro- 
teins were present in the LUCA. In support of this, a 
recent study of the composition of a “minimal” or 
“ancestral” ribosome reached the conclusion that it 
should contain 35 to 40 proteins (81). 


Ribosome Architecture 


The general architecture of the ribosome is con- 
served in all cells. However, both the small and the 
large ribosomal subunits have domain-specific RNA 
and protein composition. As noted above (“Histori- 
cal overview”), the shape of eucaryal ribosomes is 
somewhat different from that of bacterial ribosomes 
(3). In recent years, spectacular progress has been 
made using X-ray crystallography and cryo-electron 
microscopy to analyze ribosome architecture. It is somewhat
paradoxical that, although most data on ribosome
function have been obtained with Escherichia coli
ribosomes, the most detailed structural studies have
been performed on the ribosomes of other bacteria
and archaea. E. coli ribosomes have proven to be
particularly difficult to crystallize well. The best
crystallographic analyses of ribosome structure have
been performed using components from extremophilic
bacteria (e.g., Thermus thermophilus and Deinococcus
radiodurans) (16, 42) or archaea.

Figure 2. The phylogenetic tree of life. Unrooted tree showing the branching of the principal species in the three domains of life. Adapted from Lecompte et al. (69).

Table 2. Differential protein composition of archaeal ribosomes

Organism                    Missing SSU r-proteins   Missing LSU r-proteins^a
                            (of 28 total)            (of 39 total)
Crenarchaeota
  A. pernix                 -^b                      -
  P. aerophilum             -                        L35ae
  S. solfataricus           -                        L35ae; L38e
  S. tokodaii               -                        =^c
Euryarchaeota
  P. furiosus               S25e; S26e; S30e         L13e; L38e
  P. horikoshii             =                        =
  P. abyssi                 =                        =
  T. kodakaraensis          =                        =
  M. kandleri               =                        L13e; L38e
  M. jannaschii             =                        L13e; L38e; L35ae
  M. thermautotrophicus     =                        =
  M. mazei                  =                        L13e; L14e; L34e; L35ae; L38e
  A. fulgidus               =                        =
  T. acidophilum            =                        L13e; L14e; L30e; L34e; L35ae;
                                                     L38e; LXa
  T. volcanii               =                        =
  Halobacterium sp. NRC-1   =                        =
  H. marismortui            =                        =

^a Protein L41 is not included in the table because its presence/absence is
   uncertain in several species.
^b -, no proteins missing.
^c =, same missing proteins as above.

Currently, the best model of the large ribosomal 
subunit has been obtained from crystals of the Halo- 
arcula marismortui LSU (6). H. marismortui has a 
relatively small ribosome (having a number of r-proteins
similar to bacteria). Therefore, the H. marismortui
50S structure is not a good model for studying
ribosomes that are composed of the full array of archaeal
r-proteins. For example, the archaeal-specific
protein, LXa, is not present in H. marismortui (or 
other halophiles). Despite this limitation, some inter- 
esting observations can be made; one relates to pro- 
tein L7ae, which is present in the Eucarya and the 
Archaea but not present in Bacteria. L7ae was origi- 
nally identified as a ribosomal protein. However, it 
was subsequently found to function also as a component
of the machinery for rRNA posttranscriptional
modification (87). As described in more detail below
(see “Ribosome biosynthesis”), L7ae has homology with
the eucaryal protein Snu13p, which is the RNA-binding
element of the snoRNPs involved in posttranscriptional
modification of the rRNA transcripts (60).
L7ae is clearly identifiable in the three-dimensional 
structure of the H. marismortui 50S subunit, showing 
that it is a bona fide ribosomal protein. However, 
consistent with its multifunctional character, it is lo- 
cated at the periphery of the subunit and is one of the 
few r-proteins that make contact with only one
rRNA domain. Its function in the ribosome is unclear 
(6). Future studies examining the structures of whole 
ribosomal subunits from other archaea will help to 
identify the architectural features that may be unique 
to archaeal ribosomes. 


Ribosome biosynthesis 


The only information that is available on the 
mechanism of ribosome biosynthesis in archaea re- 
lates to the processing and maturation of the rRNA 
transcripts. With the exception of the thermoplas- 
males, the LSU and SSU rRNA genes are located ad- 
jacent to one another in archaeal genomes and are 
cotranscribed as a single large precursor molecule, 
similar to bacteria and eucarya. 

An important step of rRNA maturation that dif- 
ferentiates the three domains of life is the posttran- 
scriptional chemical modification of certain nu- 
cleotides. In bacteria, relatively few types of rRNA 
modifications occur, and these are mainly limited to 
the bases: the most common are A and G methylations
and a particular kind of isomerization that
transforms uridine into pseudouridine. Among the Archaea,
the Euryarchaeota display a pattern of rRNA
posttranscriptional modification that is similar to
Bacteria. In contrast, the rRNAs of the Crenarchaeota
exhibit a diverse and abundant pattern of
modifications similar to Eucarya. In addition to base
methylation and pseudouridylation, numerous examples
of methylation on the 2'-OH group of ribose
have been observed (79).

Remarkably, the Crenarchaeota and Eucarya 
employ the same elaborate mechanism for inserting 
ribose methylations in their rRNAs. In both types of 
cells, the rRNA-methylating enzymes are ribonucleo- 
protein (RNP) complexes containing small RNA 
molecules (5, 51, 52, 113). In these complexes, the 
proteins provide the catalytic activity, while the RNA 
molecule acts as the guide for target recognition. The 
small RNAs found in the RNPs involved in 2’-OH 
ribose methylation are called the C/D box sRNAs 
(snoRNAs in the Eucarya). All C/D box sRNAs contain
two conserved sequences, the C box (RUGAUGA,
where R stands for purine) and the D box (CUGA), located
near the 5’ and 3’ ends, respectively. In all archaeal
and some eucaryal C/D box snoRNAs, a second
set of conserved sequences, C’ and D’, is located
in the central region of the molecule between the C
and D boxes (55, 86). The C/D box snoRNAs guide
site-specific modification by base-pairing with the
pre-rRNA at the target sites. Specifically, C/D box
snoRNAs contain one, or sometimes two, sequences
of 10 to 21 nt located immediately upstream of the D and
D’ boxes and complementary to the target RNA.
Methylation takes place on the ribose of the target
nucleotide that is base-paired with the guide position
located exactly five nucleotides upstream of box D or
D’ (19, 86).
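
The "five nucleotides upstream of box D" rule can be expressed as a short Python sketch. It is an illustration under stated assumptions, not a published prediction tool: the function names and toy sequences are hypothetical, the guide is supplied already trimmed so that its 3' end abuts box D (or D’), and only perfect Watson-Crick pairing is considered (no G·U pairs or mismatches).

```python
# Watson-Crick complement table for RNA (assumption: no wobble pairs).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def predict_methylation_site(guide: str, target: str, offset: int = 5):
    """Locate the target nucleotide predicted to receive the 2'-O-methyl group.

    guide  -- antisense element of the sRNA (5'->3'), ending immediately
              upstream of box D or D'
    target -- target RNA (5'->3')
    offset -- guide position, counted upstream from the box, whose pairing
              partner is methylated (five, per the rule in the text)

    Returns (0-based index in target, nucleotide), or None if the guide has
    no perfectly complementary site in the target.
    """
    if not 10 <= len(guide) <= 21:
        raise ValueError("guide elements are described as 10 to 21 nt long")
    target = target.upper()
    site = reverse_complement(guide)
    start = target.find(site)
    if start == -1:
        return None
    # The 5' end of the paired target site faces the guide nucleotide that
    # abuts the box, so the partner of guide position `offset` lies
    # offset - 1 nucleotides into the site.
    index = start + offset - 1
    return index, target[index]


if __name__ == "__main__":
    toy_guide = "GGAUCCAUGCAA"           # hypothetical 12-nt guide
    toy_target = "ACGUUUGCAUGGAUCCGGA"    # hypothetical target fragment
    print(predict_methylation_site(toy_guide, toy_target))
```

A fuller treatment would first locate the C, D, C’, and D’ boxes in the sRNA and would tolerate imperfect duplexes; those steps are omitted here for brevity.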

In eucarya, four core proteins associate with all
C/D box snoRNAs: NOP56, NOP58, fibrillarin, and a
15.5-kDa protein (termed Snu13p in yeast). Fibrillarin
probably acts as the methylase enzyme (82, 117),
while the RNA-binding component of the complex is
the 15.5-kDa protein. All archaeal genomes sequenced
to date contain genes encoding proteins homologous
to the components of eucaryal snoRNA complexes:
fibrillarin, Nop56/58 (a single protein for NOP56 and
NOP58), and L7ae. As described above (see “Ribosome
architecture”), the latter is also a protein of the
large ribosomal subunit (6). It has been experimen- 
tally demonstrated that the components of the ar- 
chaeal C/D box sRNAs do associate and form com- 
plexes that carry out site-specific ribose methylation in 
vitro (86, 87, 126, 129). On the basis of these data, it 
has been proposed that the sRNP-based machinery for 
site-specific ribose methylation originated in ther- 
mophilic archaea (probably because of the need to sta- 
bilize the rRNAs against thermal denaturation) and 
was subsequently inherited by the Eucarya. 

During ribosome synthesis, rRNA processing 
and modification proceeds in parallel with the assem- 
bly of the ribosomal proteins on the rRNA in a highly 
integrated and regulated fashion. In vivo studies of ri- 
bosome assembly have not been performed in ar- 
chaea. However, in vitro studies have been performed 
for almost two decades. The ribosomal subunits of 
the crenarchaeon, Sulfolobus solfataricus (76), and of 
the euryarchaeon, Haloferax mediterranei (101, 102), 
can be functionally reconstituted in vitro from indi- 
vidual RNA and protein components. Assembly of 
thermophilic and halophilic ribosomes in vitro requires 
high temperature and high salt, respectively, condi- 
tions that reflect the organism’s natural environment. 
Aside from these basic requirements, assembly is 
spontaneous, albeit rather slow. In this respect, both 
the “large” (S. solfataricus) and “small” (H. mediterranei)
types of archaeal ribosomes resemble their bacterial
counterparts, which also spontaneously assemble
in vitro (83). In contrast, functional reconstitution
of eucaryal ribosomes in vitro has not been achieved, 
despite the efforts of many investigators for more 
than three decades. Even though the archaeal (and 
bacterial) ribosomes can spontaneously assemble in 
vitro from the separate rRNA and protein compo- 
nents, this does not exclude the need for chaperones 
or other accessory factors in vivo, in particular, to en- 
hance the rate of ribosome biosynthesis and to regu- 
late ribosome assembly. In support of this view, in 
bacteria the chaperone protein DnaK seems to be in- 
volved in ribosome biogenesis (1). Similar to many as- 
pects of archaeal protein synthesis, ribosome biogen- 
esis requires a great deal more study. 


Archaeal mRNAs 


Few experimental studies have been performed 
on the structure of archaeal gene transcripts. How- 
ever, it is clear that the translation of polycistronic 
mRNAs into multiple polypeptides is common in archaea
(3). Moreover, it has been known for many
years that archaeal mRNA, like bacterial mRNA,
contains ribosome-binding sites resembling the
Shine-Dalgarno (SD) sequences of bacteria; accordingly,
archaeal SSU ribosomal RNAs have corresponding
anti-SD sequences (28).

Recent computational analyses of genome se- 
quences that analyzed the position of transcription 
start sites, initiation codons, and potential ribosome- 
binding motifs (SD sequences) generated interesting 
findings about unique aspects of mRNA structure in 
Archaea. The putative structures of archaeal transcripts
have been predicted from the genome sequences of
the hyperthermophilic archaea S. solfataricus and
P. aerophilum (105, 109). Based on the predicted location
of promoters and the start sites of transcription,
a large proportion of mRNAs from these two
organisms appear to lack a 5’-untranslated region
(5'-UTR; leaderless mRNAs). This has been experimentally
verified for P. aerophilum (109). The absence
of a 5'-UTR means that an SD motif is unavailable
for ribosome binding. However, the downstream
genes of leaderless polycistronic transcripts usually 
contain identifiable SD motifs upstream of their initi- 
ation codons. On the basis of these data, it was pro- 
posed that two distinct mechanisms for translation 
initiation exist in Archaea: one of a bacterial type, 
based on SD motifs and operating mainly on the in- 
ternal cistrons of polycistronic mRNAs, and another 
of an unknown nature employed for initiating trans- 
lation of leaderless cistrons (12, 114). 
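
The distinction drawn above between leaderless cistrons and SD-led cistrons can be sketched as a simple classifier. Everything specific in the sketch is an assumption made for illustration: the SD core motif `GGAGG`, the 4-nt cutoff for calling a transcript leaderless, the 20-nt search window, and the function names. The surveys cited in the text used their own, more elaborate criteria.

```python
def classify_start_site(upstream: str,
                        sd_core: str = "GGAGG",
                        max_leaderless_len: int = 4,
                        window: int = 20) -> str:
    """Classify one cistron from the transcribed sequence upstream of its
    initiation codon.

    upstream -- sequence (5'->3') between the transcription start site, or
                the preceding cistron, and the initiation codon
    sd_core  -- assumed SD-like core motif (illustrative choice)
    """
    rna = upstream.upper().replace("T", "U")
    if len(rna) <= max_leaderless_len:
        # Essentially no 5'-UTR: no room for an SD motif.
        return "leaderless"
    if sd_core in rna[-window:]:
        # SD-like motif found close to the initiation codon.
        return "SD-led"
    return "leadered without SD"


def leaderless_fraction(upstreams) -> float:
    """Fraction of cistrons classified as leaderless in a collection."""
    labels = [classify_start_site(u) for u in upstreams]
    return labels.count("leaderless") / len(labels)


if __name__ == "__main__":
    examples = ["", "AU", "AUUAGGAGGUGAUCC", "AUUAUUAUUAUUAU"]  # toy data
    for seq in examples:
        print(repr(seq), classify_start_site(seq))
    print(leaderless_fraction(examples))
```

Applied genome-wide, a tally of this kind gives the proportion of leaderless transcripts predicted for each genome.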

A more recent survey of archaeal genomes pre- 
dicted the presence of two distinct types of archaeal 
transcripts (115). Group A genomes include several
(but not all) Crenarchaeota, and the Euryarchaeota,
thermoplasmales, halobacteria, and Nanoarchaeum
equitans. These are predicted to produce a high proportion
(~50%) of leaderless transcripts. In some of
these genomes the genes located internally in poly- 
cistronic transcripts are preceded by clearly identifi- 
able SD motifs. However, in other group A genomes, 
such as those of N. equitans and P. aerophilum, the 
internal genes in cistrons often lack identifiable SD- 
like sequences. 

Group B genomes are predicted to mainly pro- 
duce transcripts which possess SD motifs ahead of the 
initiation codons of both the first and the internal 
genes in operons (or of genes in monocistronic tran- 
scripts), and only produce a few leaderless transcripts. 
Group B genomes include a diverse array of species 
that includes methanogens, as well as the Crenar- 
chaeota, A. pernix and Hyperthermus butylicus, and 
the pyrococcales. 

Although this study confirmed that leaderless 
transcripts appeared to be produced in several ar- 
chaea, it also showed the marked heterogeneity that 
exists within the Archaea. An analysis of the phylo- 
genetic distribution of group A and group B genomes 
indicates that leaderless transcripts are primarily 
found in late-branching organisms (see Fig. 2), sug- 
gesting a more recent evolutionary origin. However, 
this conclusion contrasts with studies showing that 
leaderless mRNAs are universally translatable by all 
kinds of ribosomes, a finding that argues in favor of 
their primitive status (36). Further studies are neces- 
sary to shed more light on this very interesting aspect 
of archaeal translation. 


GENOME ORGANIZATION OF 
TRANSLATION COMPONENTS 


The availability of several complete genome se- 
quences of diverse archaeal species allows a mean- 
ingful comparison of the distribution and organiza- 
tion of the various RNA- and protein-encoding genes 
whose products constitute the translation apparatus. 
These consist of the numerous genes that encode the 
components of the ribosome: the three ribosomal 
RNAs (16S, 23S, and 5S) and the 68 possible r-pro- 
teins. In addition, there are genes encoding tRNAs 
and the accessory proteins that function in translation 
initiation, elongation, and termination. 

The size and number of rRNA genes found in 21 
complete archaeal genomes are summarized in Table 
1. Most species have a single copy of the LSU and 
SSU RNA genes. The exceptions are exclusively among 
the halophiles and the methanogens, most of which 
have two or three LSU and SSU rRNA genes. Some
of the late-branching euryarchaeota have multiple SSU
and LSU rRNA genes, although members of the thermoplasmales,
Thermoplasma acidophilum and Thermoplasma
volcanii, have single copies. All Crenarchaeota,
the thermococcales, and the archaeoglobales
have single SSU and LSU genes.

There is a greater tendency toward redundancy
of the 5S rRNA genes. More than half of the sequenced
genomes contain two to four 5S rRNA genes
(Table 1). Remarkably, the A. pernix genome seems
to have a eucarya-like arrangement of rRNA genes,
since it includes a 5.8S rRNA gene located about 300
bp away from the 23S rRNA gene and possibly cotranscribed
with it.

The LSU and SSU RNA genes are located close 
to one another in most archaeal species, and the avail- 
able experimental data (77) indicate that they tend 
to be cotranscribed, similar to bacteria and eucarya. 
The exceptions are the thermoplasmales and N. eq- 
uitans, whose LSU and SSU rRNA genes are located 
in distinct regions of the genome and are transcribed 
independently. In contrast, the 5S RNA genes are 
generally unlinked from the LSU and SSU RNA genes, 
except in halophiles and most methanogens. In the 
Methanococcus maripaludis genome, three 5S RNA
genes are arranged in three individual LSU-SSU rRNA 
gene clusters, and the fourth has an independent lo- 
cation, whereas in Methanopyrus kandleri the single 
5S RNA gene is unlinked from the large rRNA genes. 

The organization of r-protein genes in archaeal 
genomes is interesting. In bacteria, about half of the
r-protein genes are clustered in two large operons
(spectinomycin [spc] and S10), and their composition
is conserved in most species (23). In the Archaea, over 
one-third of the r-protein genes are included in a few 
large clusters that closely resemble the bacterial spc 
and S10 operons in the type and order of genes. The 
r-protein genes that are present in these clusters are 
particularly interesting as most of them are conserved 
in all three domains of life. 

Figure 3. (See the separate color insert for the color version of this illustration.) Organization of the main ribosomal protein gene clusters in archaeal genomes. SSO, Sulfolobus solfataricus; STO, Sulfolobus tokodaii; AFU, Archaeoglobus fulgidus; APE, Aeropyrum pernix; PFU, Pyrococcus furiosus; PHO, Pyrococcus horikoshii; PAB, Pyrococcus abyssi; TKO, Thermococcus kodakaraensis; PAE, Pyrobaculum aerophilum; MKA, Methanopyrus kandleri; MMA, Methanosarcina mazei; MAC, Methanosarcina acetivorans; MTH, Methanothermobacter thermautotrophicus; MJA, Methanococcus jannaschii; MMP, Methanococcus maripaludis; HMA, Haloarcula marismortui; H-sp, Halobacterium sp. NRC-1; TAC, Thermoplasma acidophilum; TVO, Thermoplasma volcanii. The last line (ECO) shows for comparison the organization of the same genes in E. coli that is also present in most bacteria. Genes that are within 50 bp of each other, and may therefore be cotranscribed, are indicated in the same color. Domain-specific genes are underlined.

Figure 3 shows the organization of the S10-like
and spc-like r-protein gene clusters in 19 complete
archaeal genomes. These clusters represent 22 of the 68
r-protein genes. The organization of the same genes in
E. coli is shown for comparison in the last row of Fig.
3. Gene order tends to be conserved in the Archaea
and resembles that of the Bacteria. No experimental
information is available about the transcription
patterns of these clusters, making it difficult to know to
what extent they are organized into functional operons.
Close clustering (genes located less than 50 bp
apart) may be indicative of an operon structure; if so,
the pyrococci appear to have only two large operons
of r-proteins (Fig. 3). In contrast, the genes are arranged
in smaller, operon-like clusters in other archaea,
indicating that they have smaller functional units. The
distinct functional organization of these r-protein
genes between different members of the Archaea is
indicative of extensive gene rearrangements. Moreover,
while in most species such units tend to remain
located in the same region of the genome, in a few cases
they are located far apart from one another. This is
the case for M. kandleri, in which the r-protein genes form
three hypothetical operons (Fig. 3) separated by
about half a million base pairs from one another. For
P. aerophilum, some genes have been separated from
the conserved clusters and are located elsewhere in
the genome. For example, the L6 gene is missing from
the cluster that commences with S17 and is found, on
its own, elsewhere in the genome. Similarly, the L22
gene is separated from its neighbor S3 by a few hundred
base pairs (Fig. 3).
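
The 50-bp proximity criterion used to delimit putative operons can be written as a short grouping routine. This is a sketch under stated assumptions: the gene coordinates in the usage example are invented for illustration, strand orientation is ignored, and the function name `group_putative_operons` is hypothetical.

```python
from typing import List, Tuple

# (gene name, start coordinate, end coordinate), with start < end.
Gene = Tuple[str, int, int]


def group_putative_operons(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group genes whose intergenic distance is below max_gap bp.

    Genes are sorted by start coordinate; a gene joins the current cluster
    when the gap to the end of that cluster is less than max_gap (overlapping
    genes give a negative gap and are therefore grouped together).
    """
    clusters: List[List[str]] = []
    cluster_end = None
    for name, start, end in sorted(genes, key=lambda g: g[1]):
        if cluster_end is not None and start - cluster_end < max_gap:
            clusters[-1].append(name)
            cluster_end = max(cluster_end, end)
        else:
            clusters.append([name])
            cluster_end = end
    return clusters


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    toy_genes = [
        ("L24", 860, 1200),
        ("S17", 100, 430),
        ("L14", 425, 820),   # overlaps S17 by a few base pairs
        ("L6", 5000, 5550),  # displaced far from the cluster
    ]
    print(group_putative_operons(toy_genes))
```

With these toy coordinates the first three genes fall into one putative operon and the displaced gene forms a unit of its own, which mirrors the kind of "broken" cluster described for P. aerophilum.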

On the whole, the similarity of the clustering or- 
der of the r-protein genes in archaea and bacteria (Fig. 
3) strongly suggests that it is an ancestral feature that 
was present in the LUCA and predates the radiation 
of the three primary domains. The alternative hy- 
pothesis is that similar gene clustering is due to con- 
vergent evolution. This could be explained by positive 
selection due to a functional advantage that occurred 
independently in the Archaea and Bacteria. However, 
it is not apparent what this advantage might have 
been as the gene clusters shown in Fig. 3 demonstrate 
that they can exist in a “broken” form in extant bacteria 
and archaea. 

In some archaea, the r-protein genes have a 
greater tendency to group together than in others. 
Moreover, in some species (e.g., Methanosarcina 
mazei), the clustered genes tend to be very close to each 
other, lacking spacer tracts between them and often 
even overlapping by a few base pairs, while in other 
species (e.g., Methanocaldococcus jannaschii), they 
are generally separated by at least a few base pairs. No 
correlation is apparent between these patterns of gene 
organization and the phylogenetic position of the archaeal 
species in which they occur. For example, there 
is a strong clustering of the r-protein genes in late-branching 
species (Thermoplasmales) and early-branching 
organisms (pyrococci and A. pernix), as 
well as broken clusters for early- and late-branching 
organisms, P. aerophilum and H. marismortui, respectively 
(see Fig. 2). Research examining the expression 
and regulation of r-protein genes in archaea is required 
before an understanding can be obtained of the 
functional significance, if any, of the differences in 
gene organization among the various species. 
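
The spacing criterion used in Fig. 3 (genes within 50 bp of each other may be cotranscribed) can be illustrated with a minimal Python sketch. The function name, the coordinate convention, and all gene coordinates below are hypothetical and chosen only for illustration; the sketch assumes that all genes lie on the same strand, and close spacing is of course only a hint of cotranscription, not evidence for it.

```python
from typing import List, Tuple

# (name, start, end): 1-based inclusive coordinates, all on the same strand (assumption)
Gene = Tuple[str, int, int]

def group_by_spacing(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group neighboring genes whose intergenic gap is at most max_gap bp.

    Overlapping genes give a negative gap and are therefore grouped together.
    """
    clusters: List[List[str]] = []
    prev_end = None
    for name, start, end in sorted(genes, key=lambda g: g[1]):
        if prev_end is not None and start - prev_end - 1 <= max_gap:
            clusters[-1].append(name)      # close enough: same putative unit
        else:
            clusters.append([name])        # too far apart: start a new unit
        prev_end = end if prev_end is None else max(prev_end, end)
    return clusters

# Hypothetical coordinates, for illustration only.
example = [
    ("S17", 1000, 1330),
    ("L14", 1327, 1725),    # overlaps S17 by 4 bp
    ("L24", 1740, 2100),    # 14-bp spacer
    ("L6", 52000, 52540),   # located far from the cluster
]
print(group_by_spacing(example))
```

With these invented coordinates, the three adjacent genes fall into one putative unit and the distant gene forms a unit of its own, which mirrors the "broken" cluster situation described above.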

Ribosomal protein genes (other than those shown 
in Fig. 3), in general, are found in individual genomic 
locations or in small groups of a few genes, sometimes 
intermingled with other translational genes such as 
those encoding elongation and initiation factors. Most 
of the r-protein genes that are shared by the Archaea 
and the Eucarya have this kind of organization. In all 
archaea, the one archaeal-specific r-protein gene, LXa, 
is found in a cluster that contains other genes encod- 
ing translational components that are present in the 
Archaea and the Eucarya but not the Bacteria (e.g., 
r-protein L30, the putative initiation factor eIF6, and 
the chaperone-like protein, prefoldin). 

Genes encoding translation factors are more or 
less scattered throughout all archaeal genomes. The 
four genes encoding the universal initiation factors 
YciH/SUI1, IF1/IF1A, IF2/IF5B, and EFP/IF5A (see 
“Translation initiation factors,” below, and Table 3) 
tend to be unlinked from other translational genes 
and are likely to be individually transcribed. The genes 
encoding the α-, β-, and γ-subunits of the archaeal/ 
eucaryal initiation factor a/eIF2 occupy separate locations 
in all known archaeal genomes and must 
therefore be transcribed independently. Moreover, the 
a/eIF2-α, -β, and -γ genes are seldom clustered with 


Table 3. Translation initiation factors in Archaea 

Factor name       Eucarya              Bacteria             Structure          Function in Archaea   Function in other domains 
aIF1A             eIF1A                IF1                  Solved             Unknown               Bacteria, stimulates IF2; Eucarya, assists scanning 
aIF2              eIF5B                IF2                  Solved             Unknown               Bacteria, binds fmet-tRNA; Eucarya, subunit joining 
aSUI1             eIF1/SUI1            YciH (some phyla)    Solved             Unknown               Bacteria, unknown; Eucarya, fidelity factor 
a/eIF2 (αβγ)      eIF2 (αβγ)           -                    Solved (subunits)  Binds met-tRNAi       Binds met-tRNAi 
aIF6              eIF6                 -                    Solved             Unknown               Inhibits subunit association 
aIF5A(a)          eIF5A                EFP                  Solved             Unknown               Bacteria, stimulates first peptide bond 
aIF2B (α,β,δ)(b)  eIF2B (α,β,γ,δ,ε)    -                    Solved             Unknown               eIF2 recycling 

(a) Behaves as a specialized elongation factor. 
(b) Involvement in translation initiation uncertain (see text). 
The nomenclature for the archaeal proteins varies in the literature, and the form chosen for this table is based on that used in the author’s laboratory (74). 




other translational genes, with the exception of the a/eIF2-α 
gene, which in most genomes is located in the vicinity 
of the genes for r-proteins S27 and L44. Note that the gene for 
the a/eIF2 β-subunit is clustered and possibly cotranscribed 
with several putative cell-division genes in 
some genomes. The other archaeal/eucaryal IF, aIF6, 
tends to be clustered with a few archaeal/eucaryal-type 
r-protein genes and the archaeal-specific r-protein, 
LXa. The aIF6 gene cluster contains the gene encoding 
the putative chaperone protein prefoldin, 
which is predicted to be highly expressed in many ar- 
chaea (46). The closeness of the genes in the aIF6 
cluster indicates they may be arranged in an operon 
and cotranscribed. However, no experimental data 
are available to confirm this. 

The genes for the two translation elongation fac- 
tors EF1A and EF2 are unlinked in all archaeal 
genomes, although in several cases they are clustered 
with other translation genes. Finally, the gene encod- 
ing the putative translation termination factor aRF1 
is in general not clustered with other genes encoding 
components of the protein synthesis apparatus. 


MECHANISM OF TRANSLATION 
General Overview 


Protein synthesis proceeds through 
an orderly series of steps that is conserved in all three 
domains. The first step is initiation, whereby the ri- 
bosomes interact with the translation start region and 
guide the initiator tRNA onto the initiation codon, 
thereby setting the correct reading frame for decod- 
ing. Initiation controls to a large extent the speed 
and efficiency of protein synthesis and has therefore 
evolved to be the target for most mechanisms of 
translational control. Several IFs are required to assist 
initiation. The next stage is elongation, which consists 
of three distinct steps: (i) adaptation, during which 
aminoacyl-tRNAs, in a ternary complex with EF1A 
and GTP, recognize the proper codon located in the 
ribosomal A site; (ii) transpeptidation, during which 
the peptide bond is formed; (iii) translocation, dur- 
ing which the ribosome slides by one codon toward 
the 3’ terminus of the mRNA with the aid of EF2. Fi- 
nally, termination occurs when a stop codon enters 
the A site and is recognized by the proper RF, which 
catalyzes the release of the completed polypeptide 
chain from the ribosome. 
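
The three stages outlined above can be made concrete with a deliberately simplified Python sketch. It models only the logic of reading-frame selection, codon-by-codon elongation, and termination; it ignores ribosomal subunits, factors, and GTP entirely. The function name and the example sequence are hypothetical, and the codon table is a small excerpt of the standard genetic code rather than a complete one.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial excerpt of the standard genetic code; extend as needed.
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "AAA": "K",
    "GAA": "E", "UUU": "F", "GGC": "G",
}

def translate(mrna: str, start_codon: str = "AUG") -> str:
    """Toy model of initiation, elongation, and termination on one mRNA."""
    # Initiation: the first start codon found sets the reading frame.
    pos = mrna.find(start_codon)
    if pos == -1:
        raise ValueError("no initiation codon found")
    peptide = []
    while pos + 3 <= len(mrna):
        codon = mrna[pos:pos + 3]
        # Termination: a stop codon releases the completed chain.
        if codon in STOP_CODONS:
            return "".join(peptide)
        # Elongation: decode the codon, extend the chain, move one codon 3'.
        # (Simplification: the initiator codon is treated like any other codon.)
        peptide.append(CODON_TABLE[codon])
        pos += 3
    raise ValueError("no in-frame stop codon")

# Hypothetical message: 2-nt leader, AUG, four sense codons, UAA stop.
print(translate("GGAUGGCUAAAGAAUUUUAAGG"))
```

Note that the choice of the start codon alone determines every downstream codon, which is why initiation is the natural control point emphasized in the text.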

The mechanism of protein synthesis has been 
elucidated in fine detail in bacteria and, to a lesser extent, 
in eucarya. In contrast, studies on the mechanism 
of translation in archaea are still in their infancy, 
and very little experimental information is available; 
the current extent of knowledge is summarized in the 
following sections. 


Translation Initiation 


It has been recognized for many years that the 
initiation step of translation has diverged greatly in 
the domains of life. More than three decades of re- 
search have revealed that the mechanism and the cel- 
lular machinery of translation initiation in the Bacte- 
ria and the Eucarya have little similarity. Eucaryal 
ribosomes normally initiate translation by means of 
a “scanning” mechanism, whereby the 40S subunits, 
aided by many protein factors and carrying the spe- 
cific initiator tRNA charged with unmodified methio- 
nine (met-tRNAi), slide along the message until the 
initiation codon is found (58). In contrast, bacterial 
ribosomes bind directly to the mRNAs with the aid of 
the RNA/RNA interaction between the SD sequence 
on the mRNA and the anti-SD sequence on the SSU 
rRNA. To assist the initiation reactions, bacteria have 
a minimal set of protein factors, only three proteins, 
compared with more than a dozen in eucarya (37). 
Most of the initiation factors are domain specific. 
These differences have been attributed to the different 
complexity of eucaryal and prokaryotic cells and to 
the need of the former to evolve more sophisticated 
mechanisms for translational regulation. The Archaea 
have proven to be ideal for investigating the divergent 
evolution of the translation initiation step. As a re- 
sult, considerable effort has recently been dedicated 
to the understanding of the features of translation ini- 
tiation in archaea. 


Ribosome/mRNA interaction 


Currently, the best understood aspect of transla- 
tion initiation in archaea is the mechanism of ribo- 
some/mRNA interaction. Studies carried out in the 
1980s identified the mechanism to be very similar to 
that employed by the Bacteria, i.e., based on the use 
of the SD/anti-SD interaction to promote ribosomal 
recognition of the translation initiation sites (43, 94). 
However, it was recognized that some archaeal mRNAs 
failed to fit this simple model, i.e., the leaderless mRNAs 
that had been detected in halophiles (13). At the time 
of this discovery, the leaderless mRNAs were re- 
garded as an oddity that did not challenge the essence 
of the SD-mediated scheme of mRNA/ribosome in- 
teraction. It was argued that leaderless mRNAs were 
also found, although infrequently, in bacteria and 
bacteriophages, and the problem of their interaction 
with the ribosomes had apparently been settled by a 
study carried out on the transcript of the lambda re- 
pressor gene (107). The leaderless mRNA contained a 
special sequence downstream of the initiation codon, 
termed the “downstream box,” which mediated 
mRNA/ribosome interaction by pairing with a tract 
of the 16S rRNA located at nt 1469 to 1483 (num- 
bering relative to the E. coli 16S rRNA) (33, 107). 
Leaderless mRNAs were generally thought to possess 
either a “downstream box” or a “cryptic” SD se- 
quence located a few base pairs downstream of the 
initiation codon that were recognized by the SSU with 
the customary RNA/RNA interaction mechanism 
(13). Subsequent studies challenged these conclu- 
sions, however. One concern was that recognition of 
leaderless mRNAs was widespread in some archaea 
(105, 109). Another concern was that the putative 
“anti-downstream box sequence” is not conserved in 
archaeal SSU RNAs. The third issue related to the 
demonstration that, even in bacteria, disruption of 
the putative antidownstream boxes did not inhibit the 
translation of the leaderless mRNAs (64, 85). As a re- 
sult, the view emerged that the Archaea employ two 
distinct mechanisms of mRNA/ribosome interaction: 
one that was poorly understood and operated on the 
leaderless mRNAs, and the other based on the 
SD/anti-SD interaction that was specific for the lead- 
ered messages with SD motifs (114). 

The first set of experimental data supporting the 
existence of two distinct translation mechanisms in 
Archaea was obtained using S. solfataricus as the 
model organism. With the aid of a cell-free transla- 
tion system, it was demonstrated that, when present, 
SD motifs are essential for translational initiation 
(25). The disruption of SD motifs by site-directed mu- 
tagenesis was found to completely inhibit in vitro 
translation, a result that was more pronounced than 
that found in bacteria where disruption of SD se- 
quences usually reduces, but does not abolish, protein 
production. However, other archaea, such as Halo- 
bacterium salinarum, may have a less stringent re- 
quirement for SD motifs, as their disruption leads to 
a reduction of translational efficiency but not to a to- 
tal inhibition of protein synthesis (103). 

A most remarkable finding was that the in vitro 
translation of the mRNAs whose SD motifs had 
been disrupted could be rescued by entirely deleting 
the 5’-UTR, i.e., by rendering the mRNA leaderless 
(25). These results indicate that an individual mRNA 
can be translated by two different routes, depending 
on whether it possesses a 5'-UTR endowed with a 
SD motif. 
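
The decision logic implied by these experiments can be summarized in a short Python sketch. It is an illustration only: the function name, the minimum leader length, and the use of the motif GGAGG (the core of the bacterial SD consensus) are assumptions, and real archaeal SD motifs vary in sequence and in their distance from the initiation codon.

```python
def initiation_route(utr5: str, sd_core: str = "GGAGG", min_leader: int = 5) -> str:
    """Classify an mRNA by its 5'-UTR (sequence upstream of the start codon).

    sd_core and min_leader are illustrative assumptions, not measured values.
    """
    utr5 = utr5.upper().replace("T", "U")
    if len(utr5) < min_leader:
        return "leaderless"               # start codon at or near the 5' end
    if sd_core in utr5:
        return "SD-dependent"             # leadered, with an SD motif
    return "leadered, no SD motif"        # e.g., an SD motif disrupted by mutagenesis

# Hypothetical 5'-UTRs mimicking the three experimental situations:
print(initiation_route("AUUAGGAGGUGAUCC"))  # intact SD motif
print(initiation_route("AUUACCUCCUGAUCC"))  # SD motif mutated
print(initiation_route(""))                 # 5'-UTR deleted
```

In the S. solfataricus in vitro system described above, the first and third cases would correspond to translatable messages, whereas the second would correspond to the mutants whose translation was abolished.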

The mechanism for leaderless mRNA/ribosome 
interaction is poorly understood. Some experimental 
information was obtained from in vitro studies car- 
ried out with purified translational components of 
S. solfataricus (12). In the presence of SD motifs, 
S. solfataricus SSU interacted directly and strongly 
with leadered mRNAs in the absence of any other 
translation components. Unlike their bacterial coun- 
terparts, the binary 30S/mRNA complexes that formed 
were very stable. In contrast, leaderless mRNAs were 
unable to form binary complexes with 30S subunits. 
A 30S subunit/leaderless mRNA interaction could be 
detected only in the presence of met-tRNAi (12), sup- 
porting the idea that codon-anticodon pairing was 
required for initiation site recognition, as previously 
observed for leaderless mRNA translation in E. coli 
(36). This mechanism resembles that of eucarya, 
in which the 40S subunit likewise requires met-tRNAi 
in the P site in order to land on the correct AUG initiation 
codon. 

Gene function does not correlate with mRNA 
being “leaderless” or “leadered.” However, the (still 
scarce) experimental data available allow the general 
conclusion that leaderless mRNAs are less efficiently 
translated than leadered mRNAs (e.g., reference 25). 
This contention is consistent with a genomic study 
that showed that highly expressed genes in archaea 
tended to have SD sequences upstream of their initi- 
ation codons (48). 

An important question raised by the existence of 
two initiation mechanisms in archaea is whether they 
evolved at the same time, or whether one mechanism 
preceded the other in the course of evolution. Several 
observations argue that “leaderless” initiation is the 
ancestral mechanism. The most compelling evidence 
is that leaderless mRNAs are universally translatable 
(at least in vitro) by archaeal, bacterial, and eucaryal 
ribosomes (36). Since “normal” eucaryal mRNAs are 
poorly, if at all, translated in bacterial systems (and 
vice versa), this is a remarkable fact that argues for a 
common conserved mechanism underlying leaderless 
translation. The recent observation that most mRNAs 
are leaderless in the protozoan G. lamblia (73) is interesting. 
If G. lamblia is a primitive eucaryal organism, this 
provides support for leaderless mRNAs being ancestral. 
Giardia occupies a deep branch in the eucaryal 
evolutionary tree (see Fig. 2). However, it is still unclear 
whether this is a true indication of being primitive 
or is an artifact of evolutionary analysis due to an 
abnormally fast evolutionary rate, a frequent occur- 
rence in parasitic organisms such as G. lamblia. 

Additional evidence supporting the ancestral na- 
ture of leaderless mRNAs is that, at least in bacteria, 
their translation seems to have no stringent require- 
ment for initiation factors, especially if performed by 
70S ribosomes (116). As a small set of IFs is common 
to all three domains (see “Translation initiation 
factors,” below), the minimal number of accessory 
factors required for translation initiation may be in- 
dicative of what was present in primitive cells. 
There are also data arguing against leaderless 
mRNAs being the primitive form. First, leaderless 
mRNAs are especially abundant in late-branching ar- 
chaeal species, while early-branching species tend to 
have leadered mRNAs with SD motifs (115). Based 
on these data, the latter would be predicted to be the 
“ancestral” mRNA structure. If so, the prevalence of 
leaderless mRNAs in later-evolved, and especially in 
extremely thermophilic, archaeal species may reflect a 
physiological requirement that is presently not under- 
stood. Second, it has been argued that the poly- 
cistronic arrangement of genes, and the presence of 
SD motifs, is likely to predate the branching of the 
three domains (74). The r-protein genes are often 
clustered in a similar order (and sometimes cotran- 
scribed) in both bacteria and archaea (Fig. 3). It is 
very unlikely, albeit not impossible, that this has 
arisen from convergent evolution. In early-branching 
archaea, such as A. pernix and H. butylicus (Fig. 2), 
most genes with SD motifs use AUG, GUG, and UUG 
as initiation codons in roughly the same proportion, 
while in most other species AUG is by far the preva- 
lent initiation signal (115). This suggests that a 
“primitive” function of SD motifs may have been to 
ensure correct ribosome positioning on the transla- 
tion initiation site, in a way not strictly dependent on 
the presence of an optimal initiation codon. In sup- 
port of this view, in S. solfataricus the 30S ribosomes 
can form stable binary complexes with mRNAs en- 
dowed with SD motifs, even if these are not followed 
by a proper initiation codon (12). 


Translation initiation factors 


It is well known from a wealth of detailed ge- 
netic and biochemical studies that bacteria and eu- 
carya greatly differ in the type and usage of the pro- 
teins that assist and modulate translation initiation. 
Only three IFs exist in bacteria. The principal factor, 
IF2, is an RNA-binding G-protein of ~90 kDa that 
performs the essential task of promoting the correct 
binding of the initiator tRNA (f-met-tRNA) to the ri- 
bosomal P site. IF1 and IF3 assist initiation by hin- 
dering premature subunit association, and IF3 assists 
by discouraging recognition of nonoptimal initiation 
codons (37). 

In contrast, Eucarya require a much more elab- 
orate complement of IFs (91). The cap-binding factor, 
eIF4F (absent in bacteria), is necessary to unwind the 
mRNA to allow its interaction with the ribosome. To 
successfully perform “scanning” for the initiator 
AUG codon, the eucaryal 40S subunits must carry the 
initiator tRNA (met-tRNAi) and the proteins eIF1, 
eIF1A, and eIF3 (92). eIF1 and eIF1A are both required 
for the correct identification of the start 
codon, while one of the roles of eIF3 is to connect the 
ribosome with the scanning factor eIF4F. Met-tRNAi 
binding to the 40S subunits is promoted by the 
G-protein eIF2, a heterotrimeric complex, whose sub- 
units are not homologous to the bacterial factor, IF2 
(50, 62). Adaptation of met-tRNAi in the P site is ac- 
companied by the hydrolysis of the eIF2-bound GTP, 
whereupon the factor dissociates from the ribosome. 
However, eIF2 does not have spontaneous GTPase 
activity and requires a GTPase activator factor, eIF5, 
to trigger GTP hydrolysis. Moreover, the reactivation 
of eIF2-GDP has an obligatory requirement for a 
GTP/GDP exchange factor, the pentameric protein 
eIF2B (50). After the establishment of the codon-anticodon 
interaction, eIF5B (a G-protein) stimulates 
subunit joining, leading to the formation of the 
80S monomeric ribosome (93). Finally, eIF5A is required 
to trigger the formation of the first peptide 
bond (80). 

The availability of complete genome sequences 
from several archaea has made it possible to identify 
homologs of translation initiation factors from bacte- 
ria and eucarya and thereby formulate hypotheses 
about their role in archaea. The original surveys 
proved surprising because, although archaea were ex- 
pected to have a restricted number of initiation fac- 
tors, similar to bacteria, archaeal genomes contained 
a host of genes homologous to eucaryal factors, the 
only exception being the lack of those involved in cap 
recognition (11, 30) (Table 3). 

The evolution of IFs is similar to ribosomal pro- 
teins in several ways: (i) the primary sequences of the 
putative archaeal initiation factors are always more 
similar to eucaryal proteins; (ii) a (small) set of factors 
is shared by all three domains of life; (iii) some IFs are 
shared by the Archaea and the Eucarya and not by 
the Bacteria; (iv) no IF is shared by the Archaea and 
the Bacteria and absent in the Eucarya. In addition, 
the Eucarya and the Bacteria, but not the Archaea, 
possess unique factors not present in the other do- 
mains. This latter observation may not be surprising 
as the archaeal IFs have been identified by sequence 
comparisons with IFs from the other two domains. 
A detailed biochemical and genetic analysis of ar- 
chaeal translational initiation may well lead to the 
discovery of IFs unique to the third domain. 
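
The distribution pattern summarized in points (ii) to (iv) can be checked directly against the homologs listed in Table 3. The following Python sketch is only a bookkeeping aid: the dictionary transcribes Table 3 (with None marking a domain for which no homolog is listed), and the function and group names are hypothetical.

```python
from typing import Dict, List, Optional

# Homologs as listed in Table 3; None means no homolog is listed for that domain.
TABLE_3: Dict[str, Dict[str, Optional[str]]] = {
    "aIF1A":  {"Eucarya": "eIF1A",     "Bacteria": "IF1"},
    "aIF2":   {"Eucarya": "eIF5B",     "Bacteria": "IF2"},
    "aSUI1":  {"Eucarya": "eIF1/SUI1", "Bacteria": "YciH"},  # some bacterial phyla only
    "a/eIF2": {"Eucarya": "eIF2",      "Bacteria": None},
    "aIF6":   {"Eucarya": "eIF6",      "Bacteria": None},
    "aIF5A":  {"Eucarya": "eIF5A",     "Bacteria": "EFP"},
    "aIF2B":  {"Eucarya": "eIF2B",     "Bacteria": None},
}

def classify_factors(table: Dict[str, Dict[str, Optional[str]]]) -> Dict[str, List[str]]:
    """Sort archaeal factors by the domains in which a homolog is listed."""
    groups: Dict[str, List[str]] = {
        "universal": [],
        "Archaea and Eucarya only": [],
        "Archaea and Bacteria only": [],
        "Archaea only": [],
    }
    for archaeal_name, homologs in table.items():
        in_eucarya = homologs["Eucarya"] is not None
        in_bacteria = homologs["Bacteria"] is not None
        if in_eucarya and in_bacteria:
            key = "universal"
        elif in_eucarya:
            key = "Archaea and Eucarya only"
        elif in_bacteria:
            key = "Archaea and Bacteria only"
        else:
            key = "Archaea only"
        groups[key].append(archaeal_name)
    return groups

groups = classify_factors(TABLE_3)
for label, names in groups.items():
    print(f"{label}: {', '.join(names) or 'none'}")
```

The two empty groups reflect points (iv) and the absence of known archaea-specific IFs; as noted above, the latter may simply be a consequence of how the archaeal factors were identified.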

In contrast to the evolutionary considerations 
that continue to highlight the similarities between ar- 
chaeal and eucaryal translation, comparatively little is 
known about the function of the putative IFs in ar- 
chaea. However, X-ray or NMR structures are avail- 
able for all of the putative archaeal IFs listed in Table 
3; these have been obtained with a view to gaining in- 
sight into the function of the eucaryal homologs, 
rather than the function in archaea per se. 
The one archaeal IF for which a function has 
been experimentally defined is the trimeric protein 
homologous to the eucaryal factor eIF2. In the litera- 
ture, this factor is referred to as aIF2 or a/eIF2; here, 
the second designation is employed, while using aIF2 
for the monomeric protein homologous to bacterial 
IF2 and eucaryal eIF5B. 

Eucaryal eIF2 is an important translation initiation 
factor, as it specifically interacts with the initiator 
tRNA (met-tRNAi) and carries it to the 40S riboso- 
mal subunit (50). In bacteria, the same essential func- 
tion is carried out by a monomeric protein, also called 
IF2 but unrelated in sequence to any of the eIF2 sub- 
units (37). Until recently, the prevalent rationale for 
the divergence of the tRNAi-binding factors in bac- 
teria and eucarya was that the latter had to evolve a 
more complex protein to achieve a more sophisti- 
cated regulation of translational initiation. In fact, 
eIF2 is a main target for eucaryal translational regulation. 
However, the fact that the archaea resemble 
the eucarya in having both eIF2-like and IF2-like fac- 
tors shows that cellular complexity has probably 
nothing to do with the usage of these translation ini- 
tiation factors. 

Similar to eIF2, a/eIF2 is composed of three subunits 
that associate to form a heterotrimeric complex 
(90, 123, 124). This complex is smaller than its eucaryal 
counterpart, due to the much reduced length of 
its β-subunit (15 kDa instead of ~50 kDa). In all archaea, 
the γ-subunit is the largest protein (~45 kDa) 
followed by the α-subunit (~30 kDa). The archaeal 
β-polypeptide is smaller than its eucaryal counterpart 
due to the lack of the domains that, in the eucaryal 
subunit, interact with the guanine nucleotide exchange 
factor, eIF2B, and with the GTPase activator, 
eIF5. This is consistent with the observation that all 
archaea lack a homolog of eIF5, as well as two of 
the five subunits of eucaryal eIF2B. Archaeal genomes 
do include homologs of the α-, β-, and δ-subunits of 
eIF2B but lack counterparts of the γ- and ε-subunits 
that catalyze guanine nucleotide exchange on eIF2. 
Therefore, it is probable that the archaeal homologs 
of the eIF2B α-, β-, and δ-proteins have a function 
unrelated to guanine nucleotide exchange (62). 

X-ray or NMR structures are available for all 
three subunits of a/eIF2. The largest subunit, γ, has a 
striking resemblance to EF1A (EF-Tu in bacteria) 
(97, 104), consistent with the fact that it contains 
the guanine nucleotide-binding domain and is principally 
involved in the interaction with met-tRNAi. The 
small β-subunit contains a zinc-finger motif (39). The 
α-subunit is composed of three distinct domains, two 
of which have RNA-binding properties (124). 

The function of a/eIF2 from Pyrococcus abyssi 
(123) and S. solfataricus (90) has been explored by 
in vitro biochemical assays using the factor reconstituted 
from the cloned recombinant subunits. These 
experiments revealed that, similar to its eucaryal 
counterpart, a/eIF2 specifically binds met-tRNAi and 
carries it to the ribosome. However, several features 
functionally differentiate the archaeal and the eu- 
caryal proteins. One feature relates to the nature of 
the tRNA-binding site. For a/eIF2, an αγ-dimer is 
necessary and sufficient to achieve a stable interaction 
with met-tRNAi, while met-tRNAi binding seems to 
involve mainly the γ- and β-subunits of eIF2 (29, 32). 
The eucaryal α-polypeptide participates mainly in the 
regulation of eIF2 activity; its phosphorylation, triggered 
by various metabolic cues, inhibits the eIF2B-catalyzed 
exchange of GDP with GTP, thereby blocking 
the protein in its inactive form (24, 121). Another 
functional difference is that a/eIF2 has a similar affin- 
ity for GDP and GTP and therefore does not require a 
guanine nucleotide exchange factor to be reactivated 
(90). This finding is consistent with the lack of a com- 
plete homolog of eIF2B in archaeal genomes (62). 
Thus, eucaryal-type functional regulation of a/eIF2 
(phosphorylation of the α-subunit and inhibition of 
guanine nucleotide exchange) should not exist in archaea. 
However, it has recently been reported that the 
α-subunit of Pyrococcus horikoshii a/eIF2 is phosphorylated 
by a specific protein kinase (111), although 
no data are yet available on the possible functional 
significance of this remarkable finding. 
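
The argument that a factor with similar affinities for GDP and GTP can dispense with an exchange factor can be illustrated with a simple equilibrium calculation. The Python sketch below assumes simple competitive binding of the two nucleotides at saturating concentrations (every factor molecule carries one nucleotide) and ignores kinetics; the function name, the 10:1 GTP:GDP ratio, and the 100-fold affinity difference are hypothetical values chosen for illustration, not measurements.

```python
def fraction_gtp_bound(gtp: float, gdp: float, kd_gtp: float, kd_gdp: float) -> float:
    """Equilibrium fraction of a G-protein carrying GTP.

    Assumes competitive binding at saturating nucleotide concentrations;
    concentrations and dissociation constants share the same (arbitrary) unit.
    """
    weight_gtp = gtp / kd_gtp   # statistical weight of the GTP-bound state
    weight_gdp = gdp / kd_gdp   # statistical weight of the GDP-bound state
    return weight_gtp / (weight_gtp + weight_gdp)

# Hypothetical cellular nucleotide ratio of 10:1 (GTP:GDP).
GTP, GDP = 10.0, 1.0

# Similar affinity for both nucleotides, as reported for a/eIF2.
similar = fraction_gtp_bound(GTP, GDP, kd_gtp=1.0, kd_gdp=1.0)

# 100-fold tighter binding of GDP (hypothetical factor that prefers GDP).
gdp_preferring = fraction_gtp_bound(GTP, GDP, kd_gtp=1.0, kd_gdp=0.01)

print(f"similar affinities: {similar:.2f} GTP-bound")
print(f"GDP-preferring:     {gdp_preferring:.2f} GTP-bound")
```

Under these assumptions, a factor with equal affinities is recharged with GTP simply because GTP is the more abundant nucleotide, whereas a factor that strongly prefers GDP remains largely in the GDP-bound form unless an exchange factor intervenes.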

Another difference between eIF2 and a/eIF2 is 
the probable lack of a GTPase activator protein for 
the latter. In eucarya, hydrolysis of the eIF2-bound GTP 
is triggered by the helper factor eIF5, and no recognizable 
homolog is present in the genome sequences 
of archaea. It is therefore likely that a/eIF2 has an intrinsic, 
ribosome-triggered GTPase activity, although 
this has not yet been demonstrated experimentally. 
Alternatively, GTP hydrolysis of a/eIF2 may be facil- 
itated by an unidentified GTPase activator. 

The function of the remaining IFs listed in Table 
3 has not been determined, although functions have 
been proposed for some of them. A particularly interesting 
protein is aIF2 (aIF5B), which, with aIF1A 
(homologous to IF1 in bacteria and eIF1A in eucarya), is 
one of the two IFs present in all domains (63). Prior 
to this discovery (63), IF2 was thought to be strictly 
a bacterial protein, whose function in eucarya (spe- 
cific recognition of the initiator tRNA) was fulfilled 
by an unrelated and more complex factor, eIF2. The 
function of the eucaryal IF2 homolog, eIF5B, has re- 
cently been clarified. Although nonessential, the fac- 
tor is important to promote the joining of the ribo- 
somal subunits after the 40S particle has correctly 
recognized the initiation codon (93). Similar to IF2, 
eIF5B has an intrinsic ribosome-dependent GTPase 
activity; GTP hydrolysis accompanies the ejection of 
the factor from the ribosome after its task has been 
accomplished (71). 

There are few published experimental data ad- 
dressing the function of the archaeal IF2-like factor. 
The only study performed in vivo showed that the 
M. jannaschii IF2 homolog can partially rescue yeast 
mutants lacking eIF5B (70), thus demonstrating that 
aIF2 is to some extent functionally analogous to 
eIF5B. Preliminary in vitro data also suggest that 
S. solfataricus aIF2 promotes the binding of met-tRNAi 
to the ribosome (74). 

In contrast to the paucity of functional data, 
there is much detailed information about the structure 
of aIF2. Crystallographic studies on the Methanothermobacter 
thermautotrophicus aIF2 (98) show 
that it is characteristically shaped as a chalice (Fig. 4). 
The globular “cup” of the chalice (N-terminal region) 
includes three domains, the first of which is the guanine 
nucleotide-binding domain. The “stem” of the 
chalice is a long α-helix, while the globular “base” 
Figure 4. The “chalice” structure of the archaeal IF2-like translation 
initiation factor. The crystal structure of the archaeal translation 
initiation factor aIF2, homologous to eucaryal eIF5B and bacterial 
IF2, is shown. The four protein domains are indicated. Data 
taken from the NCBI structure data bank: PDB: 1G7T viewed with 
Cn3D 4.1. 


(domain IV) corresponds to the C-terminal domain 
known to bind f-met-tRNA in bacterial IF2 (38). The 
aIF2 proteins are smaller, in general, than their eucaryal 
and bacterial homologs, as they lack the long, 
poorly conserved and functionally uncharacterized 
N-terminal region that is present in both IF2 and 
eIF5B. The structures of aIF2-GDP and aIF2-GTP 
show some differences, suggesting that the protein 
undergoes a conformational change upon GTP hydro- 
lysis that modifies its affinity for the ribosome (98). 

The function of the second universal IF in ar- 
chaea (aIF1A) remains undetermined, although the 
X-ray structures of the factors from all three domains 
have been solved (10, 106). Recent experimental data 
suggest that aIF1A forms a relatively stable complex 
with aIF2 (74). A similar type of interaction occurs 
between eIF5B and eIF1A. Bacterial IF2 and IF1 do 
not form a complex in solution, although they may 
interact on the surface of the ribosome, as suggested 
by earlier cross-linking data (15) and by a recent 
cryo-electron microscopy study (2). On the basis of 
these data, it has been proposed that a complex of the 
IF1-like and IF2-like factors is a universal and ances- 
tral feature of translational initiation, whose con- 
served function would be to enhance the affinity of 
the P site for the initiator tRNA and to stimulate sub- 
unit joining (61). 

The small protein aSUI1 (aIF1) has homologs in 
all eucarya (SUI1 in yeast; eIF1 in vertebrates) and 
in a limited number of bacterial species (YciH in 
E. coli) (26). Phylogenetic analysis shows that SUI1 
is probably an ancestral factor that has been subse- 
quently lost by most bacteria, possibly because its 
function has been replaced by another protein (61). 
In archaea, aSUI1 interacts with the 30S subunits, 
although its precise function in translation initiation 
has not been determined. In eucarya, SUI1/eIF1 is an 
essential protein that controls the fidelity of initia- 
tion codon recognition and probably also elonga- 
tion (27). 

A very interesting initiation factor present only 
in the Archaea and the Eucarya is the 25-kDa protein, 
aIF6 (eIF6). The function of this factor in the eucarya 
has been studied in some detail but remains some- 
what enigmatic. In yeast, eIF6 is an essential protein 
that is found in the nucleolus and in the cytoplasm, 
where it associates with the 60S ribosomal subunit (8, 
108). The main phenotype observed in conditional 
mutants lacking the factor is a defect in the synthesis 
of 60S ribosomes, specifically a block in the process- 
ing of the rRNA 26S precursor (9). However, the cy- 
toplasmic, 60S-bound eIF6 behaves as a ribosome an- 
tiassociating factor, preventing the formation of 80S 
particles and thereby inhibiting protein synthesis 
(108). According to a recent report, the dissociation 
of eIF6 from mammalian ribosomes requires the 
phosphorylation of the factor, which is accomplished 
when certain environmental cues activate a specific 
kinase (21). Thus, eIF6 resembles eIF2 in being a gen- 
eral regulator of protein synthesis. It remains to be 
determined whether eIF6 can serve two functions in 
translation. Clearly, functional studies of aIF6 will be 
of fundamental value in advancing understanding of 
the cellular role of this interesting protein. 

The universal protein a(e)IF5A (EFP in bacteria) 
is usually classed as a translation initiation factor. 
However, since this protein does little to help the se- 
lection of the translation start site and functions as a 
specialized elongation factor, it is described in the fol- 
lowing section. 


Elongation 


Elongation is the core process of protein synthe- 
sis and as a result is extremely well conserved in the 
three domains. The three canonical phases of elonga- 
tion (adaptation, transpeptidation, and translocation) 
proceed with similar modalities in all cells. Likewise, 
in all three domains of life, elongation is assisted by 
the two highly conserved, paralogous elongation fac- 
tors, EF1A and EF2 (EFTu and EFG in bacteria, re- 
spectively). Both EFs are G-proteins that interact with 
the ribosome in the GTP-bound form and are released 
following GTP hydrolysis as inactive EF-GDP com- 
plexes. EF1A has a much higher affinity for GDP than 
for GTP and requires the guanine nucleotide ex- 
change factor, EF1B (EFTs in bacteria), to be recycled. 

Similar to most components of the translational 
apparatus, the archaeal elongation factors are closer 
in sequence to their eucaryal counterparts. Elongation 
factor-based evolutionary trees were used to infer the 
root of the universal tree, identifying the archaea and 
the eucarya as sister domains (46). 

Interesting new insight about the archaea was 
obtained from analysis of elongation, including the 
incorporation of the rare amino acids, selenocysteine 
and pyrrolysine, respectively dubbed the 21st and 
22nd amino acids of the genetic code (see Chapter 9). 

Selenocysteine (sec) is responsible for the pres- 
ence of the trace element selenium in proteins. Se- 
lenocysteine is incorporated during elongation in re- 
sponse to an internal UGA stop codon that precedes a 
specific RNA hairpin termed SECIS (selenocysteine 
insertion structure) (see Chapter 9). Incorporation re- 
quires the action of the specialized elongation factor 
SELB, which recognizes a specialized tRNA, sec- 
tRNAsec (34, 95, 100). This pathway is common to 
all three domains of life, although specific differences 
exist among them. In bacteria, the SECIS is adjacent 
to the UGA codon and is bound directly by SELB, 
which carries sec-tRNAsec, thereby leading to the 
insertion of sec in the appropriate position (95). In the 
Eucarya, the SECIS is instead placed in the 3'-UTR of 
the mRNA and is indirectly recognized by SELB with 
the aid of the adaptor protein SBP2 (34). The differ- 
ent SECIS-binding abilities of bacterial and eucaryal 
SELB are due to structural differences in the C-termi- 
nal domain of the protein (domain IV). In bacteria, 
the SELB domain IV is longer and can interact di- 
rectly with the SECIS, while in the eucarya it is shorter 
and requires the help of SBP2 to contact the SECIS. 

In the archaea, SELB homologs have only been 
identified in methanococci (100). Sequence analyses 
revealed that these proteins have a C-terminal do- 
main IV that is even shorter than that of their 
eucaryal homolog, suggesting that its interaction with 
the SECIS may also be mediated by an adaptor. In 
support of this, the SECIS elements of archaeal 
mRNAs encoding selenoproteins are in the 3’-UTR, 
similar to eucarya (99). However, an SBP2 homolog 
has not been identified in archaeal genomes. 

Very recently, the structure of M. maripaludis 
SELB was solved by X-ray crystallography, yielding 
interesting insights into the function of the protein 
(72). Remarkably, archaeal SELB has a shape resem- 
bling a chalice that is very similar to that of the initi- 
ation factor aIF2 (aIF5B) (Fig. 3). Since bacterial IF2 
is an RNA-binding protein that binds f-met-tRNA, it 
has been suggested that, despite the reduced size of its 
domain IV, the archaeal SELB may interact directly 
with the SECIS, similar to its bacterial counterpart. 
A computer model of the SELB/ribosome complex 
shows that domain IV points toward the 3'-mRNA 
entrance where the SECIS is likely to be located and 
an interaction established (125). These findings would 
explain the apparent lack of an SBP2 homolog in archaea 
and raise the interesting possibility that aIF2 
(aIF5B) contacts the mRNA as a part of its function. 

While the synthesis of selenoproteins was estab- 
lished in bacterial systems, the discovery of pyrroly- 
sine was based in the Archaea. Pyrrolysine is a lysine 
derivative that was originally identified in methy- 
lamine methyltransferase genes of Methanosarcina 
barkeri and subsequently found in other methanogens, 
where it may play an exclusive role (44). Like sec, 
pyrrolysine is encoded by a stop codon, in this case UAG, which is 
decoded by an unusual tRNA with a CUA anticodon. 
The latter is encoded by pylT, located near a methyl- 
transferase gene cluster. The adjacent pylS gene en- 
codes a class II aminoacyl-tRNA synthetase that 
charges the pylT-derived tRNA with lysine; this 
tRNA synthetase is not closely related to known ly- 
syl-tRNA synthetases (128). However, unlike sec, the 
insertion of pyrrolysine into the growing polypeptide 
chain does not appear to require the action of a specialized 
EF, and the pyrrolysine tRNA is probably recognized 
by EF1A. It is also unlikely that pyrrolysine 
insertion requires a cis-acting mRNA element similar 
to the SECIS. Recent data suggest that UAG is rarely 
used as a termination codon in the species contain- 
ing pyrrolysine and may therefore be used as a nor- 
mal sense codon (128). 
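
The recoding of stop codons described above can be illustrated with a small sketch. The Python below is a toy model only: the function name translate, the recode parameter, and the example mRNA are hypothetical and do not come from the cited studies. It reads an mRNA with the standard genetic code and lets the caller declare that UGA is read as selenocysteine (one-letter symbol U) or UAG as pyrrolysine (one-letter symbol O). It deliberately ignores the SECIS element, SELB, and any other context requirement.

```python
from itertools import product

# Standard genetic code, built in UCAG order (first base varies slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, recode=None):
    """Translate an mRNA from its first nucleotide until a stop codon.

    recode maps codons to one-letter symbols and overrides the standard
    code, e.g. {"UGA": "U"} for selenocysteine or {"UAG": "O"} for
    pyrrolysine. This is an illustrative simplification.
    """
    code = dict(STANDARD_CODE)
    code.update(recode or {})
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":  # a codon still read as a stop ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical mRNA: AUG GCU UAG AAA UGA GGC UAA
example_mrna = "AUGGCUUAGAAAUGAGGCUAA"
standard = translate(example_mrna)
with_pyl = translate(example_mrna, {"UAG": "O"})
with_pyl_and_sec = translate(example_mrna, {"UAG": "O", "UGA": "U"})
```

In this sketch the same transcript yields a short peptide under the standard code and progressively longer products as UAG and then UGA are treated as sense codons, which is the essential consequence of recoding.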

Another interesting feature of translation elon- 
gation concerns the formation of the first peptide 
bond, which seems to require a specialized factor act- 
ing at the interface between initiation and elongation. 
The factor is a universal protein known as EFP in 
Bacteria and a(e)IF5A in Archaea and Eucarya. The 
factors from the three domains are clearly homolo- 
gous, although the archaeal and eucaryal polypep- 
tides have the highest level of sequence identity (63). 
Biochemical studies, mainly carried out on the bacte- 
rial protein, have established that it functions to cat- 
alyze the formation of the first peptide bond (35). 
Considerably more experimentation is required to de- 
termine its role in translation. 

In contrast, structural information is available 
for bacterial EFP and archaeal aIF5A. The bacterial 
factor is composed of three beta-barrel domains. It 
has an L-shaped structure reminiscent of a tRNA, 
suggesting that it might interact with the ribosomal 
tRNA binding site(s) by virtue of “molecular mim- 
icry.” It appears to bind both ribosomal subunits and 
to stimulate the peptidyltransferase center on the 50S 
particle (40). The archaeal factor (structures are available 
for M. jannaschii, P. aerophilum, and P. horikoshii 
aIF5A) is somewhat shorter than its bacterial 
homolog and includes only two, instead of three, 
beta-barrel domains (49, 89, 122). Due to the struc- 
tural difference, it has a rodlike shape rather than an 
L shape and may therefore interact preferentially with 
the large ribosomal subunit, although no experimen- 
tal data are available to address this. A remarkable 
feature of aIF5A, which is shared with its eucaryal 
homolog, is the presence of a uniquely modified lysine, 
known as hypusine (Nε-(4-amino-2-hydroxybutyl)lysine). 
A conserved lysine is present in the corresponding 
position in the bacterial protein, but it is 
not posttranslationally modified to hypusine. The 
functional role of hypusine is poorly understood, al- 
though it is known that inactivation of the hypusine- 
forming enzyme in yeast is lethal. Note that the pres- 
ence of hypusine in archaea was one of the first 
“eucaryal” features to be identified in the Archaea (7). 

EFP is one of the few bacterial translation factors 
that is larger than its archaeal/eucaryal counterparts. 
e/aIF5A may have evolved from an EFP-like ancestor 
following the loss of one of its three protein domains. 
Moreover, in archaea and eucarya, an unknown protein 
may fulfill the function of the missing domain (40). 


Termination 


Translation terminates when a stop codon enters 
the ribosomal A site. In Archaea, Bacteria, and Eu- 
carya, the recognition of the stop codon and the re- 
lease of the completed protein from the ribosome is 
carried out by a set of proteins termed release or ter- 
mination factors (RFs). The mechanism of termina- 
tion is well understood in bacteria, while several as- 
pects are still to be elucidated in eucarya. In archaea, 
very few experimental studies have been performed 
that address this step of protein synthesis. The most 
significant information derives from in silico analysis 
of bacterial and eucaryal homologs of the termination 
machinery in archaeal genomes. The analyses have re- 
vealed that, similar to initiation and elongation, ar- 
chaeal termination involves eucaryal-like proteins. 

In bacteria and eucarya, class 1 termination fac- 
tors recognize stop codons and release the completed 
polypeptide by promoting the hydrolysis of the ester 
bond that anchors it to the tRNA in the P site. Bacteria 
possess two class 1 termination factors: RF1 recognizes 
UAA and UAG, and RF2 recognizes UAA 
and UGA. In contrast, the Eucarya appear to employ 
a single factor, eRF1, to recognize all three stop codons 
(54). All archaeal genomes include genes encoding a 
polypeptide homologous to eRF1 (referred to as 
aRF1), while no bacterial RF2 homologs have been 
detected. Therefore, the Archaea appear to resemble 
the Eucarya in the use of a single factor for stop 
codon recognition. Consistent with this, the M. jan- 
naschii aRF1 promotes termination on eucaryal ri- 
bosomes (31). Despite intensive computational analy- 
ses, no meaningful similarity has been detected between 
the bacterial and the archaeal/eucaryal class 1 RF, 
which indicates two distinct protein families have 
evolved (54). The structure of human eRF1 resembles 
a tRNA by having an elongated shape with two 
“arms,” similar to the acceptor and anticodon stems 
of a tRNA (110). This “molecular mimicry” may re- 
flect the fact that both RF1 and tRNA bind in the ri- 
bosomal A site. However, the precise mechanism for 
stop codon recognition by the RF1 proteins has not 
been determined. 
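
The division of labor among class 1 release factors described above can be summarized as a lookup table. The following Python sketch is only an illustration: the names CLASS1_FACTORS and factors_for are hypothetical, and the entry for aRF1 rests on the assumption, taken from the genomic evidence cited in the text, that a single archaeal factor reads all three stop codons.

```python
STOP_CODONS = ("UAA", "UAG", "UGA")

# Stop codons read by each class 1 release factor, per domain.
# The Archaea entry is an assumption based on the text: aRF1 is treated
# as a single omnipotent factor, like eucaryal eRF1.
CLASS1_FACTORS = {
    "Bacteria": {"RF1": {"UAA", "UAG"}, "RF2": {"UAA", "UGA"}},
    "Eucarya": {"eRF1": {"UAA", "UAG", "UGA"}},
    "Archaea": {"aRF1": {"UAA", "UAG", "UGA"}},
}


def factors_for(domain, codon):
    """Return the class 1 factors of a domain that recognize a stop codon."""
    return sorted(
        name
        for name, codons in CLASS1_FACTORS[domain].items()
        if codon in codons
    )


# Every stop codon is covered by at least one factor in each domain.
coverage = {
    domain: all(factors_for(domain, codon) for codon in STOP_CODONS)
    for domain in CLASS1_FACTORS
}
```

For example, factors_for("Bacteria", "UAA") returns both RF1 and RF2, whereas UGA is read only by RF2; in the other two domains every query returns the single factor.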

Given the analogous function of aRF1 and eRF1, 
the archaeal proteins may be expected to have a shape 
similar to their eucaryal counterparts. However, a 
unique feature of aRF1 proteins is that they are ex- 
tensively truncated at their C-termini in comparison to 
both bacterial and eucaryal RF1 (54). The C-terminal 
domain of RF1 proteins is involved in interacting with 
class 2 RF, and these appear to be absent in archaea. 

In addition to the class 1 RF, the Bacteria and the 
Eucarya, but not the Archaea, possess a class 2 RF, 
referred to as RF3 in Bacteria and eRF3 in Eucarya. 
Class 2 RF proteins are G-proteins that do not participate 
in the peptide release reaction itself. The function 
of RF3 has been analyzed in some detail, and its 
main task seems to be to accelerate the recycling of 
class 1 RF proteins after translation termination (53, 
127). RF3 interacts with the ribosomes in a GDP- 
bound form, and the ribosome itself then functions as 
a guanine nucleotide-exchange factor (GEF) that pro- 
motes the release of GDP and the binding of GTP to 
the factor. RF3-GTP then catalyzes the release of 
RF1/RF2 from the ribosome and detaches following 
GTP hydrolysis. RF3 is not essential in bacteria (53), 
while its eucaryal ortholog eRF3 is an essential fac- 
tor that binds to eRF1. However, the cellular role of 
eRF3 remains enigmatic (54). 

In addition to class 1 and class 2 RFs, the Bacte- 
ria possess an additional essential termination factor, 
referred to as ribosome recycling factor (RRF). This 
factor appears to be present only in bacteria, as no homologs 
have been identified in the Archaea or the Eucarya. 

Information to date indicates that the Archaea 
are endowed with a simplified version of the eucaryal 
translation termination mechanism that utilizes a single 
class 1 RF without an RF3 or RRF protein. Detailed 
experimental studies are required to determine whether 
the Archaea possess additional, unique termination 
factors that may perform the role of the RF3 and/or 
RRF proteins from the other two domains. 


Translational Regulation 


Few studies have examined translational regula- 
tion in archaea, and these have mainly addressed au- 
toregulation, i.e., regulation by a protein of its own 
mRNA. Autoregulation of r-protein operons of meth- 
anogenic archaea was first demonstrated approxi- 
mately ten years ago (41). Little has appeared in the 
literature since this time; progress may have been 
hampered by unavailability of suitable experimental 
tools (see Chapter 21). The mechanism of transla- 
tional autoregulation in methanogens appears to be 
essentially the same as in bacteria. A well-studied case 
is the M. vannielii L1 ribosomal protein operon (en- 
coding the r-proteins L1, L10, and L12), which is 
transcribed as a single polycistronic mRNA (41, 59, 
78). The regulatory protein L1 interacts preferentially 
with its binding site on the 23S rRNA and, when in 
excess, also binds to the regulatory target site of its 
mRNA, thereby inhibiting translation of all three 
cistrons in the operon. The regulatory mRNA site, a 
structural mimic of the rRNA-binding site for L1, is 
located within the L1 gene about 30 nt downstream 
of the ATG initiation codon (59). A similar regulatory 
mechanism also exists in M. thermolithotrophicum 
and M. jannaschii. However, it is doubtful that this 
mechanism exists in nonmethanogenic archaea (59). 

An interesting example of translational regulation 
that was recently identified in archaea involves 
cotranslational frameshifting, whereby a contiguous 
polypeptide is synthesized from two, out-of-frame 
ORFs (split genes). In genes with frameshifts, the gene 
is interrupted by an in-frame stop codon, and a full-length 
protein is produced when the ribosome changes 
register and continues translation. The presence of 
several split genes in archaea has recently been docu- 
mented (22). In at least one case, an uninterrupted 
polypeptide was shown to be synthesized following a 
programmed -1 frameshift (22), a mechanism that also 
operates in bacteria. However, the details of this archaeal 
mechanism for frame-shifting have not been examined. 
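
The effect of a programmed -1 frameshift on the reading of a split gene can be sketched as follows. This Python example is a toy model under stated assumptions: the function translate_with_frameshift, its slip_at parameter, and the example mRNA are hypothetical, and the ribosome is simply made to step back one nucleotide at a given position. Real frameshift sites depend on sequence and structural signals that are not modeled here.

```python
from itertools import product

# Standard genetic code, built in UCAG order (first base varies slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_frameshift(mrna, slip_at=None, shift=-1):
    """Translate from position 0; optionally change register once.

    slip_at is the 0-based position at which the ribosome, instead of
    reading the next in-frame codon, moves by `shift` nucleotides
    (-1 by default) before continuing. Illustrative only.
    """
    protein = []
    pos = 0
    slipped = False
    while pos + 3 <= len(mrna):
        if pos == slip_at and not slipped:
            pos += shift  # the ribosome changes register here
            slipped = True
        aa = STANDARD_CODE[mrna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
        pos += 3
    return "".join(protein)


# Hypothetical split gene: the first ORF ends at an in-frame UAG (position 9);
# the second ORF lies in the -1 frame.
split_gene_mrna = "AUGGCUAAAUAGGCCUAUAA"
without_shift = translate_with_frameshift(split_gene_mrna)
with_shift = translate_with_frameshift(split_gene_mrna, slip_at=9)
```

Without the shift, translation stops at the in-frame stop codon and yields only the product of the first ORF; with the shift, a single contiguous polypeptide spanning both ORFs is produced.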


PERSPECTIVE: THE NEXT FIVE YEARS 


The study of translation in archaea is still in its 
infancy; much more work is required before a comprehensive 
understanding of its similarities to and differences 
from the other two domains of life is reached. 
Topics that are likely to progress rapidly in the 
near future are the following: 


Mechanism for Leaderless mRNA Translation 


Leaderless transcripts have attracted consider- 
able attention recently because they are produced in 
abundance by some archaea, they are decoded by a 
novel and poorly understood mechanism, and they 
have anticipated evolutionary importance. Understanding 
leaderless translation will first require extended 
structural analysis of the putative 
leaderless transcripts, many of which have been 
identified solely on the basis of in silico studies. It will 
be important to compare the translational efficiency 
of true leaderless mRNAs that have their 5’ ends co- 
inciding with the initiation codon, versus quasi-lead- 
erless mRNAs that have a few nucleotides upstream 
of the initiation codon. It should be determined how 
the length of the short 5’-UTRs influences translation 
and whether there are any preferred base composi- 
tions or sequences that may affect translational effi- 
ciency. Another open question is whether an initiator 
AUG is required for optimal translation of leaderless 
mRNAs. The biogenesis of the leaderless mRNAs 
also needs to be explored, as it is possible that some 
may be generated by posttranscriptional processing of 
longer transcripts. A very important task will be to 
determine which initiation factors are required for 
leaderless mRNA translation. 


Function of the Translation Initiation Factors 


The unexpected similarity between the archaeal 
and eucaryal IFs has attracted much attention and has 
prompted many structural studies of the archaeal 
proteins in the past few years. However, except for 
a/eIF2, the function of the putative archaeal transla- 
tion initiation factors remains undetermined. It is im- 
portant to establish which factors are required for the 
decoding of leadered and leaderless mRNAs, what is 
the precise function of the universal IFs in archaea (in 
particular, the IF2-like protein that has somewhat dif- 
ferent roles in bacteria and eucarya), and whether 
archaeal-specific IFs exist. An important prerequisite 
for enabling rapid advancement is to improve the ex- 
perimental tools for the molecular analysis of transla- 
tion. For example, the available techniques for per- 
forming molecular genetics in archaea are still far 
from being well standardized and of easy general use 
(see Chapter 21). However, such techniques are es- 
sential to enable the exploration of the function of the 
translation components by means of mutational and 
gene knockout studies. For example, in vivo studies 
are needed to determine which of the archaeal IFs are 
encoded by essential genes. 

Tools for in vitro analysis also need to be devel- 
oped. To date, cell-free systems for the translation of 
natural archaeal mRNAs have been described only 
for the crenarchaeon S. solfataricus (25). Clearly, in 
the archaeal field there is a general lack of highly re- 
solved systems (similar to those available for the Bac- 
teria and the Eucarya) for the biochemical dissection 
of translational events. Fortunately, this issue is be- 
ginning to be seriously addressed, and meaningful ad- 
vancements can be expected in the next few years. 


Translational Regulation 


Possibly the most interesting topic to be tackled 
in the next few years relates to the presence in archaea 
of translation regulation mechanisms similar to those 
in eucarya. This is indicated by the presence in the Ar- 
chaea of two IFs (a/eIF2 and aIF6), the homologs of 
which are important in regulating eucaryal protein 
synthesis. The activity of both proteins in eucarya is 
modulated by phosphorylation. eIF2 is functionally 
inactivated through the phosphorylation of its α-subunit. 
Recently, the α-subunit of a/eIF2 from 
P. horikoshii was reported to be phosphorylated in 
vitro by a specific kinase (111). This finding needs to 
be confirmed and substantiated by functional studies. 
The biological significance of a/eIF2 phosphorylation 
is not clear, since the protein has a similar affinity 
for GDP and GTP (90) and therefore cannot be regulated by 
G-nucleotide exchange in the way its eucaryal counterpart is. 
A possibility is that α-subunit phosphorylation 
regulates the interaction between the α- and γ-subunits, 
or the binding of the factor to the ribosome. 
The development of this line of research may eluci- 
date why the common ancestor of the Archaea and 
the Eucarya adopted a trimeric factor for carrying 
met-tRNAi to the ribosome. 

Another translation factor that seems to have a 
general regulatory role in eucaryal translation is eIF6. 
However, the function of this protein is poorly understood 
compared with eIF2, as it appears to participate 
both in regulating 80S ribosome formation and in ri- 
bosome synthesis (9, 21, 108). Determining the func- 
tion of archaeal IF6 will shed light on the role of the 
eucaryal homolog. In general, the study of a/eIF2 and 
aIF6 may lead to exciting new insights into the evolutionary 
history of not only the archaeal, but also the 
eucaryal mechanisms of translational regulation. 


Features of Aminoacyl-tRNA Synthesis Unique to Archaea 


INTRODUCTION 


Aminoacyl-tRNAs (aa-tRNAs) are the essential 
substrates for ribosomal protein synthesis and are 
central in ensuring accurate translation of the genetic 
message. Two processes are involved in the formation 
of aa-tRNA: (i) the transcription of tRNA genes and 
processing of the transcript to mature tRNA and 
(ii) the aminoacylation of the individual tRNA species 
with their proper (cognate) amino acid. These pro- 
cesses have been well studied in bacteria and eucarya; 
these studies led to the acceptance of a unified view of 
the mechanism of aa-tRNA synthesis. Thus, it was 
unexpected when studies of aa-tRNA formation in 
archaea revealed exceptions to this scheme. In this 
chapter, four unique aspects of archaeal aa-tRNA for- 
mation that led to a much deeper understanding of 
this process not only in the Archaea, but in all do- 
mains of life, are discussed. The topics are: (i) pro- 
cessing of half-tRNA genes to mature tRNA in Nano- 
archaeum equitans, (ii) RNA-dependent cysteine 
synthesis in methanogens, (iii) pyrrolysyl-tRNA for- 
mation in the Methanosarcinaceae, and (iv) gluta- 
minyl-tRNA synthesis in archaea. 


tRNA BIOSYNTHESIS 


tRNA molecules are formed by transcription of 
tRNA genes into longer precursor tRNA molecules 
that are converted to mature tRNAs by nuclease pro- 
cessing, enzymatic addition of the 3'-terminal CCA se- 
quence, and formation of the many modified nucleo- 
sides present in these ancient RNA molecules (30). 


A closer look at the properties of archaeal tRNAs 
reveals a variety of features that are either 
shared only with Eucarya or only with Bacteria, plac- 
ing them “tRNA-wise” in the middle of these two do- 
mains (34). The genomic distribution and promoter 
organization of tRNA genes serves as an evident ex- 
ample of this characteristic. Similar to the situation in 
Eucarya, archaeal tRNA genes are not clustered in 
operons but possess individual promoters. However, 
the promoters themselves display features that they 
share with Bacteria (21). Archaea possess a single 
complex RNA polymerase transcribing all genes, 
which, unlike the eucaryal transcription machinery, 
does not rely on conserved sequence elements within 
the tRNA gene. All bacterial and archaeal genomes 
instead display external promoters upstream of the 
tRNA genes. Comparisons of these upstream se- 
quences revealed a conserved TATA-like box A ele- 
ment (see Chapter 6), located approximately 25 base 
pairs upstream of the transcription start site (41). A 
second element, termed box B, contains the tran- 
scription start site; the RNA usually starts with a 
purine residue. To generate functional tRNA mole- 
cules, the 5’ end and the 3’ end of the primary tRNA 
transcript (pre-tRNA) need to be processed. 

At the 5’ end, this task is performed by RNase P 
in a single endonucleolytic cleavage of the pre-tRNA 
(18). RNase P is an essential ribonucleoprotein that is 
ubiquitously present in all tRNA-synthesizing cells 
and cellular compartments. The only known excep- 
tions to this are three genomes in which no definable 
RNase P RNA sequence could be identified. Apart 
from the bacterium Aquifex aeolicus, these organisms 
include Nanoarchaeum equitans and Pyrobaculum 
aerophilum (33). RNase P is thought to recognize the 
tRNA structure of the precursor tRNA by a set of in- 
teractions between the catalytic RNA subunit and the 
tRNA’s T stem and acceptor stem, together with im- 
portant residues in the 5’-leader sequence and the 
CCA 3’ terminus. The 5’ end of the histidine tRNA ex- 
hibits a further domain-specific tRNA feature; it con- 
tains one additional guanosine compared with all 
other tRNA species, which was determined to be cru- 
cial for aminoacylation by histidyl-tRNA synthetase 
(10). In Eucarya this important additional residue is 
added posttranscriptionally by the enzyme tRNAHis 
guanylyltransferase (11, 20), while it is already encoded 
in the bacterial and archaeal tRNAHis genes, 
whose transcripts are processed at position -1 by 
RNase P (8). 

Much progress in understanding 3’-end process- 
ing of pre-tRNA has been made recently (30). The 
RNase Z from Haloferax volcanii was shown to 
cleave tRNA precursors 3’ to the discriminator base, 
which resembles eucaryal 3'-tRNA processing (50). 
The cleavage efficiency of RNase Z was shown to be 
influenced by the length of the tRNA’s acceptor stem, 
whereas the 3’-terminal CCA sequence was not re- 
quired for activity (51). This CCA sequence, which is 
required for amino acid attachment to tRNA, is added 
to the 3’-end-processed pre-tRNA by the CCA-adding 
enzyme, as most archaeal (and all eucaryal) tRNA 
genes do not encode the 3’-terminal CCA sequence. 



Special attention should be drawn to introns that 
interrupt the continuity of many tRNA genes (Fig. 1). 
These sequences are removed during tRNA splicing 
to generate full-length tRNAs (1). While bacteria con- 
tain self-splicing tRNAs, eucarya and archaea use a 
classical enzymatic process. A splicing endonuclease 
excises the intron from the pre-tRNA to yield two 
tRNA half-molecules, a 5'-half-exon with a 2',3'- 
cyclic phosphate and a 3'-half-exon with 5'-hydroxyl 
terminus (32). These tRNA half-molecules are then 
joined by an RNA ligase. The homotetrameric ar- 
chaeal splicing endonuclease from Methanocaldococ- 
cus jannaschii is well characterized; its crystal struc- 
ture is known, and the enzyme recognizes a consensus 
bulge-helix-bulge structure (3 nt bulge, 4 bp, 3 nt 
bulge) found at the intron-exon junctions of intron- 
containing tRNAs, rRNA, and even mRNA (58, 62, 
66). The location of eucaryal tRNA introns is con- 
served; they are found between nucleotides 37 and 38 
of the tRNA, one nucleotide 3’ of the anticodon. How- 
ever, many archaeal tRNA genes contain introns with 
unique features; they occur at different positions, 
such as the tRNA acceptor stem, D loop, D stem, 
T loop, variable loop, and anticodon stem. Further- 
more, the BHB (bulge-helix-bulge) motifs postulated to 
form at the intron-exon junctions of archaeal tRNAs 
show divergence from the canonical structure (35). 
Introns that deviate most from the canonical BHB 
structure and position are found in the Crenarchaeota 
and in N. equitans. Their occurrence correlates with 
the presence of a novel heteromeric (αβ) splicing 
endonuclease that was shown to recognize more divergent 
splicing motifs (9, 62). A tRNA ligase that 
joins the two tRNA exons has been partially characterized 
in H. volcanii (71). 


Figure 1. Schematic representation of cis- and trans-splicing. Conventional cis-splicing involves a splicing endonuclease that 
recognizes and cleaves a bulge-helix-bulge RNA motif in the pre-tRNA leading to the excision of the intron (black). An RNA 
ligase generates the mature tRNA (gray). The unique trans-splicing of tRNA observed in N. equitans requires the annealing 
of intervening reverse complementary sequences (black) found in the primary transcripts of a 5’-tRNA half-gene and a 3’-tRNA 
half-gene. Noncanonical splicing motifs are accommodated at the junctions and recognized by the N. equitans splicing 
endonuclease. 


SPLIT tRNA GENES IN N. EQUITANS 


Bioinformatic analysis of the N. equitans genome 
sequence uncovered a phenomenon unique to this ar- 
chaeon (45, 46). In addition to 34 normal tRNA genes, 
four (tRNAIle, tRNAMet, tRNATrp, and tRNATyr) 
were discovered to have an intron. They were expected 
to be processed by a normal cis-splicing process (Fig. 1). 
However, the genes for tRNAGlu, tRNAGln, tRNAHis, 
tRNAMet, and tRNALys were encoded as half-genes 
scattered throughout the genome. Biochemical analy- 
sis showed that N. equitans utilizes a unique trans- 
splicing mechanism to obtain full-length tRNA species 
from these half-genes; this requires the assembly of 
individually expressed RNA precursors encoding the 
5'- and 3'-tRNA halves (44-46). The tRNA halves 
form a 12- to 14-nucleotide GC-rich RNA duplex 
between the end of the 5'-tRNA half and the begin- 
ning of the 3’-tRNA half. This structure probably 
provides the primary nucleation site that facilitates 
folding of the tRNA body. Noncanonical BHB motifs 
form at the junction between this RNA duplex and 
the tRNA; these RNA molecules could be cleaved 
in vitro by the heteromeric splicing endonuclease of 
N. equitans (44) or Sulfolobus solfataricus (63) but 
not by the canonical archaeal enzyme from Ar- 
chaeoglobus fulgidus (44). Why is N. equitans the 
only known organism with split tRNA genes? Since 
site-specific integration of archaeal viruses or con- 
jugative plasmids occurs exclusively at tRNA genes 
(53), one may speculate that there might be an adap- 
tive value for such a small genome in providing re- 
sistance to the integration of mobile DNA elements 
by tRNA half-genes. 
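
The annealing step that brings the two tRNA halves together can be illustrated with a short sketch. The Python below is a simplified model with hypothetical names (reverse_complement, gc_fraction, can_anneal) and a made-up 13-nucleotide sequence; it only checks that the 3' tail of the 5'-half transcript and the 5' leader of the 3'-half transcript are perfect reverse complements within the 12- to 14-nucleotide range mentioned above. Real duplexes need not be perfectly paired, and the sequences are not taken from the N. equitans genome.

```python
# Watson-Crick complement for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def gc_fraction(rna):
    """Fraction of G and C residues in a sequence."""
    return (rna.count("G") + rna.count("C")) / len(rna)


def can_anneal(five_half_tail, three_half_leader, min_len=12, max_len=14):
    """True if the two intervening sequences form a perfect duplex.

    five_half_tail: sequence at the 3' end of the 5'-half transcript.
    three_half_leader: sequence at the 5' end of the 3'-half transcript.
    The length limits follow the 12- to 14-nucleotide range in the text.
    """
    return (
        min_len <= len(five_half_tail) <= max_len
        and three_half_leader == reverse_complement(five_half_tail)
    )


# Hypothetical GC-rich intervening sequence (13 nt) and its partner.
tail = "GCGGCCGCGGGAC"
leader = reverse_complement(tail)
pairs = can_anneal(tail, leader)
mismatch_pairs = can_anneal(tail, leader[:-1] + "A")
```

In this model the matched pair anneals while a single terminal mismatch abolishes pairing; the GC-rich composition of the example mirrors the stability expected of such a nucleation site.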


AMINOACYL-tRNA FORMATION 


Once mature tRNA has been generated, each 
tRNA species needs to be acylated (charged) with the 
correct amino acid. This occurs in the process of 
aminoacylation (reviewed in detail in reference 26) 
which “matches” the amino acid carried by the tRNA 
with the corresponding anticodon (see Chapter 8). 
This is primarily achieved by the direct attachment 
of an amino acid to the corresponding tRNA by an 
aminoacyl-tRNA synthetase. However, many 
organisms lack the complete set of 20 aminoacyl-tRNA 
synthetases (aaRSs), and biochemical, genetic, 
and genomic studies have revealed the existence of 
an essential indirect two-step pathway that also provides 
correctly charged aa-tRNA. 


Direct Aminoacylation of tRNA 


The aminoacyl-tRNA synthetases are an ancient 
family of enzymes that esterify an amino acid to the 
3’ end of the cognate tRNA species. By using ATP to 
activate the amino acid, they display remarkable 
specificity solely for the cognate substrates. A further 
quality control step of many synthetases is an editing 
activity that hydrolyzes incorrectly activated amino 
acids or mischarged aa-tRNAs (24). 


tRNA-Dependent Amino Acid Conversion 
of Misacylated Aminoacyl-tRNA 


Several essential aa-tRNA species are created by 
this indirect route (Fig. 2). This pathway is mainly 
used for the synthesis of Gln-tRNA and Asn-tRNA (in 
Bacteria and Archaea) (64), of Sec-tRNA (in all three 
domains) (5), and for Cys-tRNA (in methanogens) 
(49; see below). The first step in this indirect pathway 
relies on a nondiscriminating aaRS with relaxed tRNA 
specificity to generate a misacylated tRNA. For instance, 
the Gln-tRNA biosynthesis by this route takes 
advantage of a nondiscriminating GluRS to synthesize 
Glu-tRNAGln (Fig. 2B). Then a tRNA-dependent heterotrimeric 
amidotransferase (GatCAB) converts the 
misacylated aa-tRNA by amidation to the cognate 
Gln-tRNAGln (12). A similar pathway exists to form 
Asn-tRNAAsn using a nondiscriminating AspRS and 
the same tRNA-dependent amidotransferase (GatCAB) 
(13, 38). However, in Archaea, Gln-tRNAGln is 
made by the archaea-specific heterodimeric GatDE enzyme. 
The current understanding of the Methanothermobacter 
thermautotrophicus Glu-tRNAGln amidotransferase 
GatDE is discussed below (see “The 
mechanism of Gln-tRNA formation”). 
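
The logic of the two-step indirect route can be written down as a small data model. The Python sketch below is illustrative only: the class ChargedTRNA and the functions nondiscriminating_glurs and amidotransferase are hypothetical names, and the enzymes are reduced to simple rules (a relaxed synthetase that charges both tRNAGlu and tRNAGln with glutamate, and an amidotransferase that acts only on the misacylated species).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargedTRNA:
    amino_acid: str  # amino acid attached, e.g. "Glu"
    trna: str        # identity of the tRNA, e.g. "Gln"

    @property
    def misacylated(self):
        return self.amino_acid != self.trna

    def __str__(self):
        return f"{self.amino_acid}-tRNA{self.trna}"


def nondiscriminating_glurs(trna):
    """Step 1: a GluRS with relaxed tRNA specificity attaches Glu."""
    if trna not in {"Glu", "Gln"}:
        raise ValueError(f"tRNA{trna} is not a substrate in this model")
    return ChargedTRNA("Glu", trna)


# Amide products formed from the acidic amino acids.
AMIDATION = {"Glu": "Gln", "Asp": "Asn"}


def amidotransferase(aa_trna):
    """Step 2: amidate the amino acid only on the misacylated tRNA."""
    amide = AMIDATION.get(aa_trna.amino_acid)
    if amide == aa_trna.trna:
        return ChargedTRNA(amide, aa_trna.trna)
    return aa_trna  # correctly charged species are left untouched


intermediate = nondiscriminating_glurs("Gln")
final_product = amidotransferase(intermediate)
untouched = amidotransferase(nondiscriminating_glurs("Glu"))
```

The model makes the key point explicit: specificity is supplied in the second step, because the amidotransferase converts Glu-tRNAGln but leaves correctly charged Glu-tRNAGlu unchanged.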

The biosynthesis of Sec-tRNASec is known in 
bacteria (5). It involves the formation of Ser-tRNASec 
by seryl-tRNA synthetase and then the Ser → Sec 
conversion by selenocysteine synthase. Currently, it 
is unclear how Sec is formed in eucarya or archaea, 
but it is believed not to follow the bacterial route. 


Cys-tRNACys FORMATION IN ORGANISMS 
LACKING A CANONICAL CysRS 


Figure 2. Schematic representations of indirect pathways for aminoacyl-tRNA formation. (A) Route of Cys-tRNA formation in 
M. jannaschii, and structures of phosphoserine and cysteine. (B) The indirect route for Gln-tRNAGln formation. 


Cysteine is part of the repertoire of twenty canonical 
amino acids ubiquitously used for protein synthesis. 
Cysteinyl-tRNA synthetase is a class I aaRS that produces 
Cys-tRNACys, an essential substrate for insertion 
of cysteine into proteins. While cysteine is present 
in M. jannaschii, M. thermautotrophicus, and 
Methanopyrus kandleri proteins to a similar extent as 
found in other organisms, the genomes of these or- 
ganisms do not encode a canonical CysRS. Despite in- 
tensive efforts by different groups, the question as to 
how Cys-tRNA©S might be synthesized in these or- 
ganisms remained unanswered for more than a decade 
(2). A novel approach using the combination of 
anaerobic biochemical purification and proteomic 
analysis of the chromatographic fractions finally al- 
lowed the resolution of this intriguing puzzle (49). 
Starting from a cell-free M. jannaschii extract and 
employing a rigorous identification of aa-tRNA by 
acid gel electrophoresis and Northern blot, two pro- 
teins and two low-molecular-weight factors were dis- 
covered to be essential for Cys-tRNA©"S formation 
(Fig. 2A). The first protein, an unusual aaRS homol- 
ogous to the PheRS a-subunit, selectively acylates 
tRNA‘ with phosphoserine (Sep) (Fig. 2A). The sec- 
ond enzyme subsequently converts Sep-t-RNA©’S to 
Cys-tRNA©’ in the presence of pyridoxal phosphate 
and a still unidentified sulfur donor. The two enzymes 
were named phosphoseryl-tRNA synthetase (SepRS, 
encoded by sepS) and Sep-tRNA:Cys-tRNA synthase 
(SepCysS), respectively. Searches in genomic data- 
bases revealed the presence of homologs of these pro- 


teins not only in M. jannaschii, M. kandleri, and 
M. thermautotrophicus but also in genomes of other 
archaea such as M. maripaludis, Methanococcoides 
burtonii, Methanospirillum hungatei, A. fulgidus, and 
the Methanosarcinaceae, organisms that already pos- 
sess the canonical CysRS (49). 

An indication of the physiological and evolu- 
tionary significance of such duplication in these 
organisms was provided by genetic experiments in 
M. maripaludis. While the inactivation of the cysS gene 
encoding the canonical CysRS had no effect on cell 
growth in different media (54), a M. maripaludis strain 
with a sepS deletion displayed cysteine auxotrophy 
(49). The dispensability of the canonical CysRS to- 
gether with the auxotrophic nature of the M. mari- 
paludis sepS knockout strain demonstrates that (i) in 
the presence of exogenous cysteine the SepRS/ 
SepCysS pathway and CysRS are functionally equiv- 
alent in Cys-tRNA synthesis and (ii) the indirect route 
to Cys-tRNA is the sole source of free cellular cys- 
teine in this organism (49). In M. maripaludis the 
SepRS/SepCysS pathway provides the cell with both 
Cys-tRNA(Cys) and free cysteine. Most of the other
archaea have genes homologous to genes involved in 
cysteine biosynthesis in bacteria and eucarya. In these 
organisms, the physiological significance for the 
coexistence of the SepRS/SepCysS pathway with a 
tRNA-independent cysteine biosynthesis pathway 
and a CysRS is not yet understood. Possibly the
SepRS/SepCysS pathway is still involved in free
cysteine biosynthesis under specific growth conditions. It
is also possible that the coexistence in M. maripaludis 
of both a direct and an indirect route to Cys-tRNA 
may correspond to a transient evolutionary stage that 
will eventually result in the displacement of the indi- 
rect pathway. 

Analyses of the phylogenetic distribution have 
shown that the direct and indirect pathways are of 
equally ancient origin (40). In M. jannaschii, M. kan- 
dleri, and M. thermautotrophicus the SepRS/SepCysS 
pathway may be strictly restricted to Cys-tRNA(Cys)
formation as free cysteine might not only be absent 
from the cytoplasm but also possibly toxic for the 
cell. Indeed, no homologs of tRNA-independent cys- 
teine biosynthesis genes can be identified in the 
genome sequence of these organisms. Consequently, 
these organisms might compensate for the lack of free
cysteine by using an inorganic sulfur source for various
cellular metabolic needs. While M. jannaschii tRNA
molecules were shown to contain thio-nucleotides 
(e.g., 2-thiouridine [36]), the cysteine desulfurase en- 
zymes that activate the sulfur atom from cysteine for 
tRNA modification appear to be lacking. However, 
the metal-cluster proteins that act as a sulfur carrier 
between the cysteine desulfurase and the tRNA mod- 
ification enzymes are widespread in these organisms 
(67), strongly suggesting that sulfur atoms required 
for tRNA modification originate from an inorganic 
source. Fixation of sulfur from inorganic sulfide or 
sulfite was also shown to allow synthesis of other 
metabolites in M. jannaschii such as homocysteine 
and phosphosulfolactate, key intermediates in me- 
thionine and coenzyme M biosynthesis, respectively 
(19, 67). Finally, M. jannaschii ProRS was shown to
efficiently mischarge cysteine onto tRNA(Pro) in vitro,
thus potentially compromising translational fidelity
and subsequently cell viability (2, 57). While
organisms from the bacterial and eucaryal domains have
dealt with the danger of a promiscuous ProRS by
acquiring an editing mechanism able to clear mischarged
Cys-tRNA(Pro) (3, 48), no such mechanism can be
identified in archaea, and in M. jannaschii in particular.
The absence of selective pressure to acquire a
Cys-tRNA(Pro) editing mechanism could be explained if
the mischarged tRNA species is not formed because
of a low ratio of cysteine to proline concentrations
in the cell. Maintaining a low cellular cysteine
concentration would then be crucial.

Phylogenetic analyses suggest that class I CysRS 
emerged in bacteria and spread in archaea through 
multiple horizontal gene transfer events (69). The 
SepRS/SepCysS pathway appears to be the ancestral 
archaeal/eucaryal specific tRNA cysteinylation and 
cysteine biosynthesis system. By separating the
aminoacylation and amino acid biosynthetic functions, the
acquisition of both a tRNA-independent cysteine bio- 
synthetic route and a CysRS may have provided a sig- 
nificant evolutionary advantage for the organisms, 
thus explaining the progressive displacement of the 
ancestral SepRS/SepCysS pathway. A still ongoing 
evolutionary process for the establishment of mod- 
ern day tRNA cysteinylation metabolism is in good 
accord with the idea that cysteine may have been a 
late addition to the genetic code (7). Discovery of the
tRNA-dependent cysteine biosynthetic route in
M. jannaschii may have implications that reach far
beyond the problem of Cys-tRNA(Cys) formation in
three methanogenic archaea. Indeed, the
fact that cysteine biosynthesis in M. jannaschii
proceeds via Sep-tRNA(Cys) and a subsequent Sep → Cys
conversion catalyzed by SepCysS suggests that the
same (or a similar) enzyme may also carry out the Sep
→ Sec transformation, the missing step in Sec
formation in archaea and eucarya (49).


PYRROLYSINE: AN UNEXPECTED AMINO 
ACID DISCOVERED IN METHANOSARCINA 


The Methanosarcinaceae are an exception among 
the methanogens, as they are able to use compounds 
like methanol, methylated thiols, and methylamines 
as energy sources (6). Methylation of coenzyme M 
(CoM) initiates methanogenesis; the different path- 
ways of trimethylamine-, dimethylamine-, or mono- 
methylamine-specific CoM methyl transfer require 
different methyltransferases responsible for the de- 
methylation of these energy sources and subsequent 
methylation of a different corrinoid protein that is 
subsequently demethylated by a methylcobamide: 
CoM methyltransferase. 

When the Methanosarcina barkeri methylamine 
methyltransferase genes were investigated, each of 
them was shown to contain an in-frame amber (UAG) 
codon that does not act as a translation stop during 
synthesis of the methyltransferase. Initial attempts to
characterize the UAG-encoded residue in the M. barkeri
monomethylamine methyltransferase (MtmB) by mass
spectrometry identified it as lysine (29).
However, the crystal structure of this enzyme revealed 
the UAG-encoded residue to be a new amino acid 
(22). The structure (Fig. 2) showed that lysine was 
modified with a C4-substituted pyrroline derivative. 
This 22nd cotranslationally inserted amino acid was 
named pyrrolysine (22). The identity of the C4-sub- 
stituent was initially not established, but a later crys- 
tal structure of MtmB combined with chemical syn- 
thesis (23) of the amino acid characterized the Pyl 
structure as having a methyl group. This was confirmed
very recently, as mass spectra of native preparations
of the three different methyltransferases agreed with 
the expected mass number of proteins containing one 
pyrrolysine residue (55). 
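
The notion of an in-frame amber codon can be made concrete with a short Python sketch. It reads an open reading frame codon by codon and reports the positions of UAG codons that occur before the final codon. The function name and the toy sequence are hypothetical and serve illustration only; the sequence is not taken from any M. barkeri gene.

```python
def in_frame_amber_positions(orf: str) -> list[int]:
    """Return 0-based nucleotide offsets of in-frame UAG codons,
    ignoring the final codon (assumed to be the terminal stop)."""
    rna = orf.upper().replace("T", "U")  # accept DNA or RNA input
    usable = len(rna) - len(rna) % 3     # drop any trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return [3 * idx for idx, codon in enumerate(codons[:-1]) if codon == "UAG"]


# Hypothetical toy ORF: AUG GCU UAG GGC AAA UAA
toy_orf = "AUGGCUUAGGGCAAAUAA"
amber_sites = in_frame_amber_positions(toy_orf)
```

A UAG triplet that is out of frame (for example in AUG AUA GGC UAA) is not reported, because only complete codons in the reading frame are compared.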

Concurrent with the detection of pyrrolysine was
another important finding: the discovery in M. barkeri
of a UAG suppressor tRNA (56), a candidate for the
tRNA that is charged with Pyl and that decodes the
UAG codon as Pyl on the ribosome.
This tRNA(Pyl) (encoded by pylT) has an unusual tRNA
structure (56) and has only two modified nucleosides
(43). The pylT gene is in an operon arrangement with
three other open reading frames, pylS, pylB, and pylC.
The pylS gene appeared to encode a class II aminoacyl-tRNA
synthetase-like protein. The other three genes are
thought to be involved in pyrrolysine biosynthesis
(56). Based on the presence of pylS and pylT and the
knowledge of many genomes, the machinery for Pyl 
insertion appears to be present in the Methanosarci- 
naceae family and in the bacterium Desulfitobac- 
terium hafniense (56). In addition, a proteomic analysis 
showed the presence of the pyl genes in the Antarctic 
archaeon Methanococcoides burtonii (17). 

Originally it was thought (43, 56) that the
mechanism of Pyl-tRNA(Pyl) synthesis involved an initial
formation of Lys-tRNA(Pyl) and subsequent Lys → Pyl
conversion in a tRNA-dependent manner, as is known
for Sec-tRNA(Sec) (5) and for Cys-tRNA(Cys) synthesis in
M. jannaschii (49). This idea was supported by the
report that PylS was able to charge tRNA(Pyl) with Lys
(56) and by the unexpected finding that the
combination of M. barkeri class I LysRS (25) and M. barkeri
class II LysRS was required to form Lys-tRNA(Pyl) (43).
It is pertinent to mention that, in archaea, only the
Methanosarcinaceae contain both classes of lysyl-tRNA
synthetase. A ternary complex of two aaRSs and
one tRNA has been modeled (59) and may have played
a role in the evolution of the two classes of tRNA
synthetases (47). Although these aminoacylation results
suggested tRNA-dependent amino acid modification
as the path to Pyl-tRNA(Pyl), one could not dismiss the
possibility that Pyl was present as a metabolite ready
to be charged directly onto tRNA(Pyl) by a special aaRS.
Analysis of the M. barkeri genome revealed only pylS
as the remaining unannotated synthetase-related gene.
Using chemically synthesized Pyl, it was demonstrated
that PylS forms Pyl-tRNA(Pyl) but not Lys-tRNA(Pyl)
(Fig. 3) (42). Although these in vitro data were in
disagreement with the earlier PylS study (56), they were
fully supported by an independent investigation (4).
Addition of Pyl to the growth medium of Escherichia
coli expressing the M. barkeri mtmB gene gave further
in vivo support to the notion that PylS specifically
charges Pyl to the cognate tRNA(Pyl) (4). Therefore
PylS was called pyrrolysyl-tRNA synthetase (PylRS).

How does Pyl get inserted into proteins? Is the 
particular UAG codon reassigned (recoded) and specif- 
ically identified to the ribosome during translation? 
Or is Pyl inserted at random UAG codons in an
inefficient process by the tRNA(Pyl) amber suppressor?
To consider this it is important to recall how UGA 
is recoded for Sec; this is achieved by an essential 
EF-Tu-like protein (SelB) and a particular hairpin 
structure present in the mRNA (SECIS element) of the 
selenoprotein (5). 

The bioinformatic investigation of the
Methanosarcinaceae genomes revealed no genus-specific open
reading frames similar to EF-Tu except for
significantly truncated SelB paralogs. Thus, Pyl-tRNA(Pyl)
may be delivered to the ribosome by the normal
elongation factor(s). Such a possibility is supported by the
fact that Pyl-tRNA(Pyl) in E. coli behaved like a
normal amber suppressor (4); consistent with this is the
observation that an anticodon-mutated Lys-tRNA(Pyl)
is recognized by Thermus thermophilus EF-Tu (60, 70).
However, analyses with RNA-folding programs
predicted hairpin-like structures (named PYLIS) six
nucleotides downstream of the “pyrrolysine” UAG
codon in the mtmB genes of the Methanosarcinaceae
(39) and the D. hafniense mttB gene (28). A synthetic
“PYLIS element” based on the M. barkeri mtmB1
sequence was shown by chemical and enzymatic
probing experiments to fold in the predicted way (61). A
global bioinformatic survey of the genes encoding
selenoproteins and expected Pyl-containing proteins
notes that the mechanism of UGA/Sec recoding may
differ from that of the UAG/Pyl reassignment, and
strongly suggests that Pyl-tRNA(Pyl) may act as a
normal suppressor tRNA (70).

Figure 3. Schematic representation of pyrrolysyl-tRNA(Pyl) formation by PylRS and structures of pyrrolysine and lysine.


THE MECHANISM OF Gln-tRNA 
FORMATION 


The tRNA-dependent transamidation pathways
are the primordial route of Gln-tRNA and Asn-tRNA
formation (Fig. 2B) (27, 69). The tRNA-dependent
aa-tRNA amidotransferases are exciting enzymes, as
they are the result of a recruitment of
amino-acid-metabolizing enzymes (amidases and asparaginases)
to become components of enzymes crucial in protein
synthesis. The best-characterized enzyme to date is
the heterodimeric archaea-specific Glu-tRNA(Gln)
amidotransferase, GatDE (Fig. 4) (14, 54, 64). GatDE,
like GatCAB, catalyzes three distinct but coupled
reactions. (i) The enzyme is a kinase that forms the
activated intermediate γ-phosphoryl-Glu-tRNA(Gln)
(P-Glu-tRNA(Gln)) by ATP hydrolysis (68). (ii) The
enzyme is a glutaminase that generates ammonia by
glutamine or asparagine hydrolysis (14). (iii) The
enzyme is an amidotransferase that, using the
ammonia liberated in step (ii), amidates P-Glu-tRNA(Gln)
to Gln-tRNA(Gln).

Structural and sequence homology along with 
biochemical results implies that GatD belongs to the 
family of L-asparaginases and that it is the enzyme 
subunit responsible for liberating ammonia (14, 54, 
64). L-Asparaginases generate ammonia by hydrolyzing
Gln → Glu or Asn → Asp. Mutational studies of
M. thermautotrophicus GatD have confirmed the 
importance of four residues conserved in asparagi- 
nase active sites (T101, T177, D178, and K254 in 
M. thermautotrophicus GatD) for the glutaminase ac- 
tivity of GatD (14). In addition, GatD dimerizes in a 
fashion similar to other type I asparaginases; the 
structure of the Pyrococcus abyssi GatD homodimer 
is superimposable with the L-asparaginase from the 
same organism (54). 

GatE along with GatB belongs to an isolated
protein family (54, 64). An insertion domain of about
190 aa present in GatE is the major difference
between the two proteins (61). Structurally the
domain resembles a domain found in the bacterial
AspRS enzymes and may explain why GatDE is only
able to use Glu-tRNA(Gln) as a substrate (54). GatE
has been implicated in the kinase activity of the
holoenzyme, responsible for the formation of
P-Glu-tRNA(Gln) (14). This activated intermediate has not
yet been experimentally trapped with pure GatDE,
as had been accomplished using crude Bacillus
subtilis extract (68). A recent structure of the
M. thermautotrophicus GatDE complexed to tRNA(Gln)
shows that GatE binds tRNA in a cradle domain lined by
amino acids conserved in the GatE/GatB protein family.
Those residues are involved in making up the
kinase and transamidase active sites of GatE (Fig. 4)
(40a, 54).

The mode of interaction of the two enzyme
subunits is now known (Fig. 4). A tunnel, made up by a
series of highly conserved amino acid residues in the two
subunits, connects the asparaginase active site in GatD to
the amidotransferase active site on GatE and allows
the delivery of ammonia for the final amidation step.
It appears that in the P. abyssi apoenzyme structure
(54) the tunnel is closed and the catalytically
important Thr in GatD (14) is 7 Å away from the GatD
active site (54). This suggests that, upon Glu-tRNA
binding, conformational changes occur in GatDE
which move the Thr into a position enabling
catalysis and open the tunnel to channel the resulting
ammonia for amidation of Glu-tRNA bound to GatE
(54). Such conformational changes would explain the
biochemical observation that GatDE is only able to function
as a glutaminase in the presence of Glu-tRNA(Gln) (14),
implying a tight coupling between the three activities
of Glu-tRNA(Gln) amidotransferase.

Why do archaea have a unique enzyme for
Gln-tRNA synthesis that was not replaced during
evolution by lateral gene transfer with GlnRS (69)? In
archaeal proteins, Asn and Gln are underrepresented
relative to their levels in proteins from mesophilic
bacteria and eucarya (37). Thus, the presence of the
“more efficient” GlnRS would not provide the same
selective advantage it might give to some bacteria and
to eucarya. Differences in tRNA identity elements
between archaeal and eucaryal or bacterial tRNA(Gln)
sequences may also explain why glnS did not
transfer to archaea (64); E. coli and yeast GlnRS enzymes
are unable to aminoacylate M. thermautotrophicus
tRNA(Gln) unless this tRNA species is given the
E. coli tRNA(Gln) identity elements (64). Differences
in amino acid metabolism might be another reason
why GlnRS is not found in archaea and might
explain why GatDE developed uniquely, since GatD
employs a different catalytic strategy in generating
ammonia compared with GatA (64). Much remains to
be done to elucidate the role of GatDE in archaeal
protein synthesis and cell metabolism.

[Figure 4 panel labels: M. thermautotrophicus GatDE:tRNA(Gln) complex. GatD (asparaginase subunit), asparaginase active site (Gln + H2O → Glu + NH3). GatE (amidotransferase subunit), amidotransferase active site (Glu-tRNA(Gln) + NH3 → Gln-tRNA(Gln) + H2O).]

Figure 4. (See the separate color insert for the color version of this illustration.) Crystal structure of the M. thermautotrophicus
GatDE complexed with tRNA(Gln). The dimer of the heterodimeric GatDE (thus forming a heterotetramer) binds two tRNA
molecules. The asparaginase active site of GatD and the kinase/amidotransferase active site of GatE are distantly separated,
connected with a “molecular tunnel.”


PERSPECTIVE: THE NEXT FIVE YEARS 


Studies on aaRSs in archaea have led to an
exciting expansion of our knowledge of how the
genetic code is translated and expanded. The use of
designed orthogonal aminoacyl-tRNA synthetase/tRNA
pairs (15) led to exciting achievements in the
incorporation of unusual amino acids into proteins
(summarized in reference 65). While such studies have relied
on the in vitro design of novel tRNA and aaRS pairs,
it is likely that the “naturally evolved”
PylRS:tRNA(Pyl) or SepRS:tRNA(Cys) pairs may prove far more
efficient. Further studies on PylRS- or SepRS-mediated
recoding in archaea may be beneficial for devising
more efficient systems for incorporation of
nonnatural amino acids in other systems (31).

Chapter 10


Protein-Folding Systems 


FRANK T. ROBB, RYO IZUKA, AND MASAFUMI YOHDA 


INTRODUCTION 


All organisms face a protein-folding problem, 
which is the requirement to convert their proteins 
from a random coil conformation emerging from the 
ribosome into homogeneous, precisely folded states. 
For survival and efficient growth under normal con- 
ditions, cells must be able to maintain the majority 
of their proteins in native conformations and to re- 
cover proteins damaged by exposure to stressors. Pro- 
tein folding in the Archaea is a fascinating topic be- 
cause many archaeal species are able to grow, and 
seemingly fold and salvage their proteins, in extreme 
conditions that might be expected to preclude proper 
folding. 

Archaeal hyperthermophiles grow at
temperatures up to 113°C (6), and possibly as high as 121°C
(6, 38), and archaeal psychrophiles at temperatures as
low as −2°C (11).
Thermoacidophiles grow in solfataric environments
at pH 0 to 1, and extreme halophiles in evaporative
salt deposits supersaturated with Na+, K+, or Ca2+
(10, 73) (see Chapter 2). Adaptations to
environmental extremes include the synthesis of heat-stable
and impermeable phytanyl ether-linked membrane 
lipids (20) (see Chapters 2 and 15) and exceptionally 
stable outer surface glycoproteins (14) (see Chapter 
14) that protect the interior milieu of the cells from 
the harsh external environment. However, microbial 
cells cannot be thermally insulated, and all the com- 
ponents of cellular metabolism, including the protein- 
folding pathways, must be adapted to function un- 
der the prevailing temperature regimes. 

The compositional and conformational adapta- 
tions of proteins that result in intrinsic stability un- 
der moderate or extreme conditions are fairly well 
understood (80). Protein adaptations include highly
charged exterior surfaces, rigid structures maintained
by multiple ion pair networks, tight hydrophobic core 
packing, and overall compact protein structures 
achieved by increased packing density to minimize in- 
ternal voids (96). Adaptive amino acid substitutions 
can be detected through comparisons of protein se- 
quences and compositional biases in large data sets. 
High-temperature adaptation is associated with a
high content of the charged amino acids (lysine,
arginine, glutamate, and aspartate) that promote
increased surface charge and ion pair formation. Other
compositional changes include a higher residue
volume and a decrease in uncharged polar amino acids
on the surface of proteins (26) (see Chapter 16). In
extremely stable proteins, high intrinsic stability of 
the proteins affects folding processes as additional de- 
solvation energy is required to localize the charged 
residues within the hydrophobic interior of the pro- 
tein (15, 16). The maintenance of functional pro- 
teomes in hyperthermophiles thus raises questions 
relating to the energetics of folding at very high tem- 
peratures, a topic that is poorly understood in terms 
of energetics and the pathways of chaperone activity. 
This chapter describes the known members of ar- 
chaeal protein-folding pathways, including not only 
the heat-shock-regulated members, but also the non- 
heat-shock-regulated protein chaperones. 
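
The compositional bias described above (a high content of lysine, arginine, glutamate, and aspartate) can be estimated from a protein sequence with a few lines of Python. This is a minimal sketch under stated assumptions: the function name and the example sequences are hypothetical, sequences are given in one-letter code, and no threshold for “thermophilic” composition is implied.

```python
# One-letter codes for lysine, arginine, glutamate, and aspartate
CHARGED = frozenset("KRED")


def charged_fraction(protein: str) -> float:
    """Fraction of residues in a one-letter protein sequence that are K, R, E, or D."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]  # skip gaps, spaces, '*'
    if not residues:
        raise ValueError("sequence contains no residues")
    return sum(aa in CHARGED for aa in residues) / len(residues)


# Hypothetical toy sequences, not real proteins
example_high = charged_fraction("KREDAAAA")
example_low = charged_fraction("GGGG")
```

Applied to whole proteomes, the same calculation would be averaged over all predicted proteins before species are compared.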


GENOME SIZE AND COMPLEXITY OF THE 
REPERTOIRES OF PROTEIN CHAPERONES 


The published genomes of archaeal species span 
a more than tenfold size range, from the 0.49-Mbp genome of
Nanoarchaeum equitans to the 5.75-Mbp genome of 
Methanosarcina acetivorans (see Chapters 2 and 19).
Coding density in these circular genomes is high, with
N. equitans, a parasitic hyperthermophile, having the 
smallest cellular genome and providing the highest 
coding density recorded to date (99). The hyperther- 
mophiles, with optimal growth temperatures (Topt) 
>80°C, tend to have smaller genomes than meso- 
philic species such as Halobacterium strain NRC1 
and the Methanosarcina species. As a result, protein- 
folding strategies are relatively simple or more com- 
plex depending on genome size and the number of 
homologs and paralogs that they encode. 

Inventories of chaperones found in the genomes of 
Archaea include representatives of protein families 
characteristic of Eucarya, including the prefoldins, 
small heat shock proteins (sHsp), and ATP-dependent 
chaperonins (Table 1). Two major classes of eucaryal 
chaperones (Hsp100 and Hsp90/Hsp83) are absent in 
archaeal genome sequences. The chaperones that are 
shared with bacteria, including the “Chaperone
Machine” (60), which is composed of Hsp70 (DnaK),
Hsp40 (DnaJ), and GrpE, only occur in the larger
complete genome sequences of mesophilic archaea (58, 61),
as well as in the psychrophilic methanogen
Methanococcoides burtonii (21; R. Cavicchioli, personal
communication). Hyperthermophiles represented by
Pyrococcus spp., Sulfolobus spp., Pyrobaculum aerophilum,
Methanocaldococcus jannaschii, Methanopyrus
kandleri, Archaeoglobus fulgidus, N. equitans, and
Picrophilus torridus (78) do not have Hsp90, DnaK, DnaJ,
GrpE, Hsp33, or Hsp10 homologs (Table 1). The
smaller archaeal genomes lack the highest molecular 
weight chaperones found in eucarya. The Hsp100/Clp 
protein family contains the largest ATP-dependent heat 
shock proteins so far characterized. 

In Escherichia coli, degradation of denatured
proteins is mediated by the cooperative functions of
the ClpA and ClpP proteins. ClpA has protein
remodeling functions in addition to “protein repair”
functions. In E. coli, ClpA alone can reactivate the
replication initiator protein, RepA, from an inactive RepA
dimer to an active RepA monomer (90). The major
chaperone classes, Hsp100 and Hsp90/Hsp83, are
absent from the genomes of the hyperthermophilic
archaea, although they are present in several mesophilic
and thermophilic archaea (Table 1). For example,
the thermophilic methanogen Methanothermobacter
thermautotrophicus contains a ClpA/B homolog,
which could have been acquired by lateral gene
transfer from bacteria.


Table 1. Occurrence of different classes of HSPs
in the three domains

HSP       Bacteria               Eucarya              Archaea
HSP100s   ClpA, ClpB, HslU       HSP100               Absent(a)
HSP90s    HtpG                   Hsp90, Hsp83         Absent
HSP70s    DnaK                   Hsp70, Hsc70         Hsp70(b)
HSP60s    GroEL/ES               TCP-1                TF55, Thermosome, Cpn60
sHSPs     IbpA, IbpB (E. coli)   sHSP, α-crystallin   sHSP

(a) ClpA/B homologs are found in M. thermoautotrophicum.
(b) Hsp70s are absent in most thermophiles and hyperthermophiles except
A. pernix, in which the putative mitochondrial HSP70 has been identified
from the complete genome sequence.

In Saccharomyces cerevisiae, Hsp100 and Hsp104
have important roles in acquired thermotolerance
(57). A characteristic feature of Hsp100/Clp,
Hsp104, and HslU/V (the high-molecular-weight
chaperones) is the presence of conserved AAA+
domains. These AAA+ heat shock proteins share
sequence similarity with the CDC48 and NSF proteins
in the S. cerevisiae and human genomes, respectively.
Clp/Hsp100 proteins can also function cooperatively 
with other chaperones, such as Hsp70, Hsp40, or 
GrpE. However, the mesophilic methanogen Metha- 
nosarcina mazei has a bacterial-type GroEL/GroES 
system that can substitute functionally for the E. coli 
components in vitro (53). 


CHAPERONES AND THERMOTOLERANCE 


Organisms exposed to a brief sublethal heat 
shock develop tolerance to otherwise lethal
temperatures. This phenomenon is referred to as acquired
thermotolerance. It has been well established in a 
diverse range of organisms (e.g., Drosophila, yeast, 
E. coli) that Hsp (heat shock protein) induction is re- 
sponsible for acquired thermotolerance. In the Ar- 
chaea, evidence for an adaptive thermotolerance re- 
sponse linked to chaperone expression was first 
discovered in the hyperthermophilic archaeal species, 
Sulfolobus shibatae (92, 93). Acquired thermotoler- 
ance was achieved following heat shock at 88°C for 
60 min, which enabled the cells to survive a normally 
lethal exposure at 95°C for 40 min (92). Acquired 
thermotolerance was accompanied by the synthesis of 
high levels of the chaperone Hsp60. 

Other environmental stressors, including treat- 
ment of yeast cells with 2.5% NaCl, 4% ethanol, or 
exposure to reduced pH (e.g., pH 5) induce thermo- 
tolerance (35). Cells exposed to sublethal levels of 
these types of stressors display increased synthesis of 
Hsps compared with nonstressed cells, demonstrating 
that Hsps can be involved in cellular responses to a 
variety of stressors in addition to heat shock. 

The gene for a putative AAA+ homolog
(NP_579611) from the hyperthermophile Pyrococcus
furiosus is up-regulated following heat shock at 105°C
by induction of a repressor protein, Phr (heat shock
regulator protein) (97) (see Chapter 6). A single phr
gene is present in all three Pyrococcus genome
sequences (P. furiosus, P. abyssi, and P. horikoshii) and
encodes a 24-kDa basic protein. In P. furiosus, the
promoters of the heat shock-inducible hsp20 (Pfu-shsp)
and aaa+ ATPase genes have highly conserved
dyad operator sites (51). The expression of the phr
gene was not induced by heat shock, suggesting that
the Phr protein may be required at both normal
growth and heat shock temperatures. Repression is
relieved by an unknown mechanism during heat shock,
and a cis-acting regulatory sequence has been
described that may be important for heat shock
regulation (97). Aligning the upstream regions of the
AAA+-encoding genes from P. furiosus and P. abyssi enabled
conserved regions to be identified which may be
Phr-binding sites in both organisms (53). The promoter
region of the AAA+ gene from P. abyssi also has Phr
recognition motifs similar to the promoter of the
heat-inducible, small heat shock protein (shsp) gene from
P. furiosus, indicating that these two species may be
using a common heat shock regulatory mechanism (45).
Since the Clp/Hsp100 genes are absent in archaea
(with the exception of the Methanosarcina spp.), it is
possible that the AAA+ proteins from
hyperthermophilic archaea may fulfill a similar functional role
by combining with as-yet unidentified cochaperones.
In this regard, the Lon protease from the thermophile
Thermoplasma spp. is an example of a naturally
occurring fusion protein that contains an AAA+ domain
(4). The heat-inducible AAA+ proteins (and
unidentified functional partner proteins) may be functionally
analogous to the Clp proteases, GroEL/GroES
(bacterial chaperonin), or DnaJ/K (bacterial chaperones).

HtpX is a putative membrane-bound
metalloprotease in bacteria and is ubiquitous in archaea, although
it is annotated in many archaeal genomes as a
conserved hypothetical protein. One copy of the htpX gene
is present in the genomes of P. furiosus and P. abyssi,
and two copies are present in each of the genomes of
P. horikoshii and S. solfataricus. Similar to AAA+
protein genes, htpX is heat inducible in M. jannaschii (7),
Archaeoglobus fulgidus (17, 81), and P. furiosus (85)
(P. Laksanalamai, J. DiRuggiero, F. Robb, and T. Lowe,
manuscript in preparation). It is possible that htpX
may have a cellular function that is similar to, or
complements, the heat shock-inducible AAA+ proteins.


ORIGINS OF NSF AND HslUV 


The heat shock-inducible AAA+ proteins in the
archaea are very distantly related to
chaperone-associated AAA+ modules in eucarya, but are
homologous to the yeast CDC48 proteins. CDC48 proteins
are molecular chaperones that are crucial for correct
cell division in eucarya, and regulate spindle
disassembly following mitosis. The AAA+ proteins from
Sulfolobus species have high sequence identity (40 to
55%) to the yeast CDC48 homolog and to the related
human NSF proteins; NSF proteins modify the
conformations of specific integral membrane proteins. It
seems likely that this has resulted from
nonorthologous gene displacement (53). The AAA+ domain
present in a major class of heat shock-regulated
chaperones in archaea may have originated from a
membrane-localized bifunctional chaperone, which
was an ancestor of the modern NSF proteins. NSF
and HslU are structurally similar proteases, and it is
plausible that the archaeal homologs of NSF may
have features that allow them to carry out HslU-like
proteolytic functions (55). Either the heat-inducible
AAA+ proteins, or the proteasome-associated AAA+
subunit in archaea, could resemble the progenitor of
the NSF function in eucarya, and might have been
one of the core functions present in the
“proto-eukaryote” (4). These chaperone-associated AAA+
modules in eucarya could have arisen by the
acquisition of another AAA+ module, introducing ATP
hydrolysis into an ancestral folding system that lacked
ATPases.


PREFOLDINS 


Prefoldins are universally present in eucarya and 
archaea, with similar structures, but are absent in 
bacteria. The prefoldins are “holdase” chaperones 
whose crystal structure was first resolved from the archaeon 
Methanothermobacter thermautotrophicus 
(86, 91). The chaperone has been likened to a jellyfish 
in shape, with a globular “body” and six canonical, 
antiparallel coiled coils (the “tentacles”) with their 
N and C domains oriented outwardly from an 
oligomerization domain (Fig. 1). The coiled-coil “tentacles” 
form a cavity lined with hydrophobic patches 
that secure nonnative target proteins (59). In archaea, 
with one exception, prefoldins are hexamers 
consisting of two α-subunits and four β-subunits, 
which act as generalized holding chaperones. The archaeal 
prefoldins bind to a wide range of nonnative 
proteins in vitro, although their intracellular substrates 
are not known. Although similar in overall 
structure, the eucaryal prefoldins consist of six nonidentical 
subunits (two α-class and four β-class subunits) 
and, in contrast to archaeal prefoldins, bind 
specifically to the ribosome-nascent forms of actins 
and tubulins (56). 

Figure 1. Structure of the prefoldin from M. thermautotrophicus. 
The globular body and coiled-coil extensions in the “jellyfish” 
model form an adjustable cage that accommodates and binds proteins 
by a clamp mechanism (59). Reproduced from Nature Reviews 
Microbiology (53) with permission of the publisher. 


Several recent lines of evidence indicate that pre- 
foldins can act cooperatively with chaperonins, such as 
HSP60, and load nonnative proteins into their cavity. 
The prefoldin tentacles are capable of flexing outward 
to accommodate both small (14 kDa, lysozyme) and 
large (62 kDa, firefly luciferase) proteins in the cavity 
formed by the “tentacles” to prevent their aggregation 
(56, 59). The holding-and-release mechanism of the 
archaeal prefoldins has recently been elucidated (72, 
107). In addition, the transfer of nonnative substrates 
to chaperonins has been well characterized using sur- 
face plasmon resonance and takes place between the 
prefoldins and chaperonins from one species of Pyro- 
coccus, but not when chaperones from different Pyro- 
coccus species are used (72). The hyperthermophilic 
methanogen M. jannaschii has genes for the α- and 
β-subunits of prefoldin. However, a unique third 
prefoldin subunit is encoded by the pfdγ gene and is 
heat shock regulated, unlike the α- and β-subunits (7, 
28, 34, 68). This system raises new questions regarding 
the functional assignments of this heat shock-inducible 
prefoldin and the sHSPs, since they have overlapping 
chaperone activities in vitro. 


SMALL HEAT SHOCK PROTEINS 


Putative shsp genes are present in all archaeal 
genome sequences including N. equitans. Both the 
sHSPs and vertebrate α-crystallins are holdase-type 
molecular chaperones (28, 34, 68). The sHSPs and 
α-crystallins have a monomeric molecular weight 
range of 15 to 40 kDa and typically form polydisperse 
multimeric complexes in vivo. However, in the 
Archaea, biochemical characterization is limited to 
thermophilic and hyperthermophilic organisms. 

Only two crystal structures of sHSPs from unre- 
lated organisms, M. jannaschii (42) and Triticum aes- 
tivum (wheat) (95), have been reported. The sHSPs 
share amino acid sequence similarity with the central 
core of vertebrate eye lens α-crystallin proteins, which 
are conserved in this family of proteins through all 
domains of life. The sHSP proteins have relatively 
low amino acid sequence similarity, and their quater- 
nary structures are dissimilar, but the monomeric 
structures of these proteins are almost identical. Their 
specific functional mechanisms may be determined by 
their individual quaternary structures and their cog- 
nate target proteins and chaperone partners. The ar- 
chaeal sHSPs can prevent denatured proteins from 
aggregating under strong denaturing conditions, and 
in some cases, are able to refold denatured proteins 
(43, 51, 82, 94). The sequences of the N- and C- 
terminal domains of archaeal sHSPs differ, and this 
variability is responsible for the great variety of mul- 
tisubunit structures that form. The N-terminal do- 
main of the M. jannaschii sHSP16.5 is disordered in 
the crystal structure, but low-resolution features 
have been resolved by cryoelectron microscopy. This 
domain is essential for proper holdase function in 
sHSP16.5 (41). 

The copy number of sHSP-encoding genes is 
variable among archaeal species. The thermophilic 
and hyperthermophilic archaea contain one, two, or 
three shsp homologs. Hyperthermophilic species 
growing optimally near 100°C have one shsp gene, 
with the exception of Pyrobaculum aerophilum, 
which has two homologs (51). Thermoplasma aci- 
dophilum and all the Sulfolobus spp. represented by 
genome sequences each have three shsp homologs. 
However, one of the sHSPs in T. acidophilum appears 
to have domains that are similar to the two ATPase 
domains of ArsA from E. coli (83). Sulfolobus solfa- 
taricus and S. tokodaii have one 14- to 15-kDa and 
two 20- to 21-kDa sHSPs each. The mesophilic 
methanogens M. acetivorans and M. mazei Gö1 
contain three and four shsp homologs, respectively. 
However, one of the two sHSPs from M. acetivorans 
(NP_619401) does not appear to belong to the 
α-crystallin-type HSPs. The genome sequence of 
Halobacterium NRC-1 has the highest paralogy, encoding 
five sHSPs that all clearly belong to the α-crystallin 
family. It seems likely that the multiple sHSPs en- 
coded in a single species perform a range of poten- 
tially overlapping cellular functions; however, this has 
not been assessed experimentally. 
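
The copy numbers quoted in this paragraph can be collected into a small lookup for side-by-side comparison. The following is only an illustrative sketch: the dictionary name `shsp_copies` and the helper `most_paralogs` are hypothetical, and the values are limited to the counts stated explicitly in the text above (species described only as having "one" homolog are omitted).

```python
# shsp gene copy numbers per genome, as stated in the paragraph above.
# The names used here are illustrative; only the counts come from the text.
shsp_copies = {
    "Pyrobaculum aerophilum": 2,
    "Thermoplasma acidophilum": 3,
    "Sulfolobus solfataricus": 3,
    "Sulfolobus tokodaii": 3,
    "Methanosarcina acetivorans": 3,
    "Methanosarcina mazei": 4,
    "Halobacterium sp. NRC-1": 5,
}


def most_paralogs(copies):
    """Return the organism name(s) with the highest shsp copy number, sorted."""
    top = max(copies.values())
    return sorted(name for name, count in copies.items() if count == top)


if __name__ == "__main__":
    for name in most_paralogs(shsp_copies):
        print(name, shsp_copies[name])
```

Such a tally only restates the genome annotations; it says nothing about whether the paralogs are functionally redundant, which the text notes has not been tested.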

The role of sHSPs in protein folding is still a 
topic of active investigation in both archaea and eucarya. 
They can maintain solubility of nonnative proteins 
under physiological conditions indefinitely, for 
example, in the eye lens, displaying a remarkable capacity 
for binding nonnative target proteins present 
in greater concentration than the chaperones. The 
binding capacity of eucaryal α,β-crystallins for nonnative 
proteins is greatly stimulated by serine phosphorylation 
of the sHSP, and the dynamic reordering 
of sHSP complexes is required for solubilization of 
nonnative proteins (84). Although archaeal systems 
for protein phosphorylation have been described (see 
Chapter 11), it is unknown whether the archaeal 
chaperones are phosphorylated. 

Recently, reconstitution of a protein-refolding 
pathway in vitro was described (52). Denatured Taq 
polymerase was reactivated cooperatively at 100°C 
by a mixture of sHSP or prefoldin with HSP60 from 
P. furiosus in an ATP-dependent folding pathway. 
The cooperative protein salvage pathway depends on 
the presence of HSP60 and ATP for full activity 
(Fig. 2). The sHSPs (Fig. 2A) and prefoldins (Fig. 2B) 
appear to fulfill similar roles in this system, namely to 
transfer denatured proteins to the HSP60 chaperonin, 
although the differences in their structures suggest 
that they transfer nonnative proteins to chaperonins 
by different mechanisms. The rate of refolding of Taq 
polymerase was minimal when just the holdase chap- 
erones were present and was greatly increased when 
HSP60 and ATP were present. 


NAC PROTEINS 


The nascent polypeptide-associated complex 
(NAC) was first isolated from bovine brain cytosol 
and recognized as a molecular chaperone. Multiple 
subunits of NAC were first characterized in yeast, and 
formation of NAC complexes with ribosomes ap- 
pears to be critical for folding and export of eucaryal 
proteins (76). The mechanism of eucaryal NAC com- 
plexes is not well understood; several hypotheses ex- 
ist, and studies are ongoing. The hypothesis that 
NAC proteins prevent inappropriate interaction be- 
tween newly synthesized polypeptide chains and 
other cellular factors appears to be well supported. 
NAC functions in archaea may be similar to bacter- 
ial trigger factor (TF), since TF homologs are absent 
from all archaeal genomes. NAC proteins have been 
shown to be involved in translational control and lo- 
calization of Oskar mRNA. Unlike eucarya, archaea 
do not have multiple NAC subunits. The recent characterization 
of a NAC protein from S. solfataricus 
(89) revealed that it is a homodimer. The monomers 
have two domains, formed by the N- and C-terminal 
regions of the protein. The N-terminal domain is homologous 
to NAC α-subunits in eucarya. All archaeal 
NAC proteins have a C-terminal ubiquitin-associated 
(UBA) domain. At present, the hypothesis 
that the complex interacts with ubiquitin is speculative 
(89). Putative ubiquitin homologs occur in 
several archaeal genomes but are missing from several 
others, and consequently the UBA domain, which is 
strongly conserved in all archaeal genomes, may have 
functions unrelated to ubiquitin binding. 


THE GROUP II CHAPERONINS IN 
ARCHAEA: MECHANISTIC INSIGHTS 
FROM MINIMALISTIC CHAPERONINS 


The chaperonins are ubiquitous molecular chap- 
erones that form double-ring assemblies of subunits 
with a molecular mass of 60 to 70 kDa. The result- 
ing structures have a large central cavity where non- 
native proteins can undergo productive folding in an 
ATP-dependent manner (9, 27). The paradigm for 
chaperonin-assisted protein folding has been the 
group I GroE system from E. coli, consisting of the 
GroEL chaperonin and associated GroES cochaper- 
onin, which are generally found in bacteria and eu- 
caryal organelles of bacterial origin (27, 87). 


STRUCTURE AND SUBUNIT COMPOSITION 
OF ARCHAEAL GROUP II CHAPERONINS 


The archaeal group II chaperonins form toroidal 
double rings with an eight- or ninefold symmetry, 
consisting of homologous subunits (25). The archaeal 
chaperonins are composed of up to five sequence- 
related subunits. Sulfolobus species (2, 37), Haloferax 
volcanii (54), M. mazei (45), and M. burtonii (21; 
R. Cavicchioli, personal communication) contain three 
chaperonin genes. Table 2 lists the number of sub- 
units per genome and subunit composition of chap- 
eronins from characterized members of the Archaea. 
Recently, it was found that there are five chaperonin 
subunits (Hsp60-1, -2, -3, -4, and -5) in M. acetivorans. 
Among them, Hsp60-1, Hsp60-2, and Hsp60-3 
have orthologs in Methanosarcinaceae, but the others, 
Hsp60-4 and Hsp60-5, occur only in M. acetivorans. 
The Hsp60-4 and Hsp60-5 paralogs may represent 
a third class of chaperonin that may be ancestral to 
the two widely distributed group I and II orthologs (62). 

The subunit composition of the chaperonin com- 
plexes in several archaea changes with growth tem- 
perature (32, 37, 102). The chaperonin from the hyper- 
thermophilic archaeon Thermococcus sp. strain KS-1 

Figure 2. Effect of chaperones sHsp and Hsp60 on the thermostability of Taq DNA polymerase in the presence of P. furiosus 
molecular chaperones. (A) Inactivation of Taq polymerase in the presence of individual subunits of sHsp, Hsp60, 
Hsp60-Mg2+-ATP, sHsp and Hsp60, and sHsp and Hsp60-Mg2+-ATP. The controls are reactions without the 
addition of chaperones and with the addition of Mg2+ and ATP. (B) Inactivation of Taq polymerase in the presence 
of individual subunits of prefoldin (prefoldin α and β), prefoldin complex, Hsp60, Hsp60-Mg2+-ATP, 
prefoldin and Hsp60, and prefoldin and Hsp60-Mg2+-ATP. The controls are reactions without the addition of 
chaperones and with the addition of Mg2+ and ATP. Reproduced from Biotechnology and Bioengineering (52) with 
permission of the publisher. 

Table 2. Archaeal chaperonin: number of subunits encoded per genome 

Organism                                  Subunit species^a                 Rotational symmetry^b   Reference 
Crenarchaeota 
  Aeropyrum pernix                        2 (α, β)                          NR                      39 
  Pyrobaculum aerophilum                  2 (α, β)                          NR                      67, 75 
  Pyrodictium occultum                    2 (α, β)                          8                       36 
  Sulfolobus acidocaldarius               3 (α, β, γ)                       NR                      2 
  Sulfolobus shibatae                     3 (α, β, γ)                       9                       2375 
  Sulfolobus solfataricus                 3 (α, β, γ)                       9                       37, 46 
  Sulfolobus tokodaii                     3 (α, β, γ)                       NR                      2 
Euryarchaeota 
  Archaeoglobus fulgidus                  2 (α, β)                          8                       17-73 
  Halobacterium sp. NRC-1                 2 (α, β)                          NR                      70 
  Haloferax volcanii                      3 (CCT1, 2, 3)                    NR                      54 
  Methanocaldococcus jannaschii           1                                 NR                      47 
  Methanococcus thermolithotrophicus      1                                 8                       18 
  Methanococcus maripaludis               1                                 NR                      50 
  Methanopyrus kandleri                   1                                 8                       1, 67 
  Methanosarcina acetivorans              I: 1                              NR                      62 
                                          II: 5 (Hsp60-1, -2, -3, -4, -5)   NR 
  Methanosarcina barkeri                  I: 1                              NR                      62 
                                          II: 3 (1, 2, 3)                   NR 
  Methanosarcina mazei                    I: 1                              NR                      45 
                                          II: 3 (α, β, γ)                   8? 
  Methanothermobacter thermautotrophicus  2 (α, β)                          NR                      88 
  Picrophilus torridus                    2                                 NR                      19 
  Pyrococcus abyssi                       1                                 NR                      www.genoscope.cns.fr/Pab/ 
  Pyrococcus furiosus                     1                                 NR                      79 
  Pyrococcus horikoshii                   1                                 NR                      72 
  Thermoplasma acidophilum                2 (α, β)                          8                       71, 98 
  Thermoplasma volcanium                  2 (α, β)                          NR                      40 
  Thermococcus kodakaraensis              2 (α, β)                          NR                      33, 101 
  Thermococcus sp. strain KS-1            2 (α, β)                          8                       104, 106 

^a Methanosarcina acetivorans, M. barkeri, and M. mazei contain both group I (refer to “I”) and group II (refer to “II”) chaperonins. 
^b NR, not reported (72). 


is composed of two highly sequence-related subunits, 
α and β (106), that form a heterooligomer with variable 
subunit composition in vivo (102). Expression of 
the α- and β-subunits is regulated differently, and only 
the α-subunit is thermally inducible (102). The proportion 
of the α-subunit in Thermococcus KS-1 chaperonin 
increases with temperature, and the β-subunit-rich 
chaperonin is more thermostable than the 
α-subunit-rich chaperonin (103). The hyperthermoacidophilic 
archaeon S. shibatae contains group 
II chaperonins composed of up to three different subunits 
(α, β, and γ). Expression of the α- and β-subunits 
is increased by heat shock and decreased by cold 
shock (37). On the other hand, expression of the 
γ-subunit gene is undetectable at heat shock temperatures 
and low at normal growth conditions, but induced 
by cold shock (22, 37). A cold-adaptation response 
in M. burtonii has also been studied by using 
ICAT proteomic profiling (22). The halophilic archaeon 
H. volcanii has three group II chaperonin 
genes, cct1, cct2, and cct3, which are all expressed 
but to differing levels (54). Deletion of cct3 has no 
effect on the activity of the chaperonin complex, but 
loss of cct1 leads to ~50% reduction in the purified 
chaperonin ATPase activity (58). The precise functional 
properties and physiological significance of the 
heterologous subunit composition of archaeal group 
II chaperonins are still the subject of active 
investigation. 

The prototype crystal structure of the group II 
chaperonin is shown in Fig. 3. This structure from the 
thermoacidophilic archaeon T. acidophilum has 
shown that the subunit architectures are very similar 
to group I chaperonins, except for differences in the 
helical protrusion region (8, 13, 44). The helical pro- 
trusion in group II chaperonins may provide a func- 
tional role equivalent to the GroES subunit of group 
I chaperonins by sealing off the central cavity of the 
chaperonin complex (29, 64) (Fig. 3). 


PROTEIN-FOLDING MECHANISM OF 
ARCHAEAL GROUP II CHAPERONINS 


Although nucleotide and amino acid sequences 
of many archaeal chaperonins have been reported, 

Figure 3. (See the separate color insert for the color version of this illustration.) Structure of archaeal group II chaperonin 
from Thermoplasma acidophilum. (A and B) The side view and top view of the crystal structure of T. acidophilum chaperonin, 
respectively. The α subunits are shown in dark green, and the β subunits are shown in dark blue. The hexadecameric 
structure was drawn using MOLSCRIPT (48). (C) The subunit structure of T. acidophilum chaperonin. Apical, intermediate, 
and equatorial domains are represented by green, blue, and red, respectively. The helical protrusion is highlighted by yellow. 
The figure was drawn with the Viewer Light 5.0 software (Accelrys). 

there are comparatively few reports on their functional 
characterization; these include the native chaperonins 
from S. solfataricus (23, 24) and Thermococcus 
KS-1 (18, 102), and recombinant chaperonins 
from Methanococcus thermolithotrophicus (18), 
Pyrococcus horikoshii (72), Methanococcus maripaludis 
(5, 49, 50), T. acidophilum (5), and Thermococcus 
KS-1 (106). The group II chaperonin from 
Thermococcus KS-1 has been studied in most detail. 
The Thermococcus KS-1 α- and β-subunits each assemble 
into double-ring homooligomers (α-chaperonin 
and β-chaperonin, respectively) and are able to 
capture denatured proteins and fold them in an ATP-dependent 
manner in vitro (105, 106). Taking advantage 
of this, significant progress has been made in 
defining functional mechanisms (29-31, 105) (see 
“Role of the helical protrusion” and “Functional 
asymmetry,” below). The known properties, including arrest 
and ATPase activity, and structural characteristics of 
archaeal chaperonins appear in Table 3. 


ROLE OF THE HELICAL PROTRUSION 


The helical protrusion is strictly conserved among 
group II chaperonins (25). From crystallographic 
studies (13, 44), the region was thought to function as 
a substitute for the cochaperonin GroES of the group 
I chaperonins and to be important for binding to un- 
folded proteins. To elucidate the exact role of the he- 
lical protrusion of a group II chaperonin in its mole- 
cular chaperone function, three deletion mutants of 
Thermococcus KS-1 a-chaperonin were constructed, 
lacking one third, two thirds, and the whole of the 
helical protrusion, respectively. Protease sensitivity 
assays and small-angle X-ray scattering (SAXS) ex- 
periments were performed to examine the conforma- 
tional changes of the wild-type and mutant proteins. 
While the binding of ATP to the wild-type protein in- 
duced a structural transition corresponding to the 
closure of the built-in lid, it did not cause significant 
structural changes in the mutant proteins. Although 


Table 3. Structural and functional characteristics of archaeal group II chaperonins 

Organism                              Subunit species  Native or recombinant  Rotational symmetry     ATPase activity  Arrest activity^a  Folding activity  Reference 
Crenarchaeota 
  Sulfolobus shibatae                 3                Native                 9                       Trace            +                  NR                37, 93 
  Sulfolobus solfataricus             3                Native                 9                       Trace            +                  +                 2, 23, 24, 46, 63 
  Sulfolobus tokodaii                 3                Native                 NR                      Trace            +                  −                 2, 69 
                                                       Recombinant (α, β)^b   NR                      Trace            +                  − 
  Pyrodictium brockii                 NR^i             Native                 8                       NR               NR                 NR                75 
  Pyrodictium occultum                2                Native                 8                       +                NR                 NR                66, 75 
                                                       Recombinant (α, β)     8                       +                +                  NR 
                                                       Recombinant (α+β)^c    8                       +                +                  NR 
Euryarchaeota 
  Archaeoglobus fulgidus              2                Native                 8                       NR               NR                 NR                17, 75 
  Haloferax volcanii                  3                Native                 NR                      +                NR                 NR                54 
  Methanococcus jannaschii            1                Native                 NR                      +                +                  +                 47 
  Methanococcus thermolithotrophicus  1                Recombinant^d          8                       +                +                  +                 18 
  Methanococcus maripaludis           1                Recombinant            NR                      +                +                  +                 50 
  Methanopyrus kandleri               1                Native                 8                       +                NR                 NR                1, 3 
                                                       Recombinant            8                       +                +                  NR 
  Methanosarcina mazei                3                Native                 8 (α:β:γ = 2:1:1)       NR               NR                 NR                45 
                                                       Recombinant            8 (αγ)^f                +                + (αβγ)^g          − (αβγ) 
  Pyrococcus horikoshii               1                Recombinant            NR                      +                +                  +                 72 
  Thermoplasma acidophilum            2                Native                 8                       Trace            +                  NR                5, 71, 98 
                                                       Recombinant (α, β)     8                       +                NR                 NR 
                                                       Recombinant (α+β)      8                       +                Trace              Trace 
  Thermococcus kodakaraensis          2                Recombinant (α, β)     NR                      +                + (β)^h            NR                33, 101 
  Thermococcus sp. strain KS-1        2                Native                 NR                      +                +                  +                 104, 106 
                                                       Recombinant (α, β)     8                       +                +                  + 

^a “Arrest activity” means the binding activity to nonnative proteins. 
^b α- and β-subunits are separately expressed in E. coli and purified. 
^c α- and β-subunits are coexpressed in E. coli and purified. 
^d The measurement is carried out at 30°C. 
^e Reconstituted complex of purified subunit. 
^f Reconstituted complex of α- and γ-subunits. 
^g Reconstituted complex of α-, β-, and γ-subunits. 
^h Purified β-subunits prevent thermal inactivation of yeast alcohol dehydrogenase. 
^i NR, not reported. 

the mutants effectively protected proteins from thermal 
aggregation, the ATP-dependent protein-folding 
ability was remarkably diminished. The results indi- 
cate that the helical protrusion is not necessarily im- 
portant for binding to unfolded proteins, but its ATP- 
dependent conformational change mediates folding of 
captured unfolded proteins (29, 31). 


FUNCTIONAL ASYMMETRY 


Although many studies have been carried out on 
the lid conformation, the steps of the ATPase cycle remained 
obscure until recently. The pathway was revealed 
by determining the effects of ADP-beryllium 
fluoride (BeFx) complex formation on the Thermococcus 
KS-1 α-chaperonin (30). Biochemical assays, 
electron microscopic observations, and SAXS measurements 
demonstrate that α-chaperonin incubated 
with ADP and BeFx exists in an asymmetric conformation; 
one ring is open, and the other is closed. The 
binding of ADP under conditions of inhibition of 
ATPase activity by BeFx resulted in freezing the alternation 
of conformational changes in the complex. 
The result indicates that α-chaperonin shares the inherent 
functional asymmetry of bacterial (9, 27) and 
eucaryal cytosolic chaperonins (64, 65). 

There is, however, a difference between archaeal 
and eucaryal group II chaperonins: ATP binding 
is sufficient to close the lid of archaeal chaperonins 
(30), whereas the lid closure of CCT is triggered 
by the transition state of ATP hydrolysis (64). 

Addition of ADP and BeFx induced the Thermococcus 
KS-1 α-chaperonin to encapsulate unfolded 
proteins in the closed ring but did not trigger their 
folding. Moreover, the α-chaperonin incubated with 
ATP and BeFx adopted a symmetric closed conformation, 
and its functional turnover was inhibited. The 
existing evidence indicates that asymmetric and symmetric 
molecules are present in the functional ATPase 
cycle of archaeal group II chaperonins (30). 

A schematic model for the functional mechanism 
of archaeal group II chaperonins is depicted in Fig. 4 
(30). In the absence of nucleotides, the chaperonin is 
maintained in the open conformation and captures 
nonnative polypeptides. Although the exact location 
of the substrate-binding site has not been determined 
experimentally, it is likely that nonnative substrates 
interact via the exposed hydrophobic surface of the 
apical domain. This is supported by the finding that 
the helical protrusion region is not required for the 
recognition and binding of the substrate protein. Sub- 
sequently, the binding of ATP leads to a conforma- 
tional change to the asymmetric structure, in a similar 
manner to ATP and GroES binding with the substrate- 
bound ring of bacterial GroEL. In contrast to the 
GroEL/ES model, protein folding is not induced by 
the closing of the lid, and further conformational 
change seems to be required. The asymmetric confor- 
mation changes to the symmetric closed conforma- 
tion when ATP binds to the other ring. The substrate 
is released from the cavity wall into the hydrophilic 
central cavity, where productive folding occurs (Fig. 4). 
Consequently, the release of the γ-phosphate generated 
by ATP hydrolysis in the folding active ring triggers 
the opening of the lid and release of the substrate 
(Fig. 4). The two rings of group II chaperonins may 
alternate as a folding chamber, similar to GroEL, and 


Figure 4. Schematic model for the reaction mechanism of archaeal group II chaperonins, showing capture of the nonnative 
polypeptide followed by the ATP binding, ATP hydrolysis, and Pi release stages. See text for details. A, I, and E refer 
to the apical, intermediate, and equatorial domains, respectively. H represents the helical protrusion. Reproduced from Journal 
of Biological Chemistry (104) with permission of the publisher. 

the archaeal group II chaperonins may have a molecular 
mechanism similar to GroEL and its partner 
GroES. The cyclic switching of chaperonin chambers 
may be compared with the compression/decompres- 
sion cycles of a two-stroke gasoline engine. 
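
The proposed reaction cycle can be summarized as a short ordered list of ring states. The sketch below is a reading aid rather than a model from the cited work: the names `Ring`, `CYCLE`, `is_symmetric`, and `describe_cycle` are hypothetical, and the state of the second ring after Pi release (shown as closed) is an assumption based on the statement that the two rings may alternate as folding chambers.

```python
from enum import Enum


class Ring(Enum):
    OPEN = "open"
    CLOSED = "closed"


# Each entry: (trigger, (ring 1, ring 2), consequence described in the text).
# The ring-2 state in the last step is an assumption, not stated in the source.
CYCLE = [
    ("no nucleotide", (Ring.OPEN, Ring.OPEN),
     "nonnative polypeptide is captured"),
    ("ATP binds ring 1", (Ring.CLOSED, Ring.OPEN),
     "lid closes; substrate is encapsulated but folding is not yet induced"),
    ("ATP binds ring 2", (Ring.CLOSED, Ring.CLOSED),
     "substrate is released into the central cavity; folding occurs"),
    ("Pi released from ring 1", (Ring.OPEN, Ring.CLOSED),
     "lid opens; substrate is released"),
]


def is_symmetric(rings):
    """True when both rings are in the same conformation."""
    return rings[0] == rings[1]


def describe_cycle():
    """Return one summary line per step of the proposed cycle."""
    lines = []
    for step, (trigger, rings, outcome) in enumerate(CYCLE, start=1):
        shape = "symmetric" if is_symmetric(rings) else "asymmetric"
        lines.append(
            f"({step}) {trigger}: {rings[0].value}/{rings[1].value} "
            f"({shape}); {outcome}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(describe_cycle()))
```

Listing the states this way makes the contrast with GroEL/GroES easy to see: lid closure in the second step does not by itself start folding, which begins only once the symmetric closed state is reached.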


OCCURRENCE OF GROUP I CHAPERONINS 
IN METHANOSARCINA SPECIES 


The genome sequences of several Methano- 
sarcina species have been completed, for example 
M. mazei (12) and M. acetivorans (21), and have re- 
vealed that these species contain both group I and II 
chaperonin genes (45, 61). In M. mazei, both chaperonin 
complexes are assembled from subunits whose genes are 
moderately induced by heat stress (45). How group I and II 
chaperonins share the protein-folding duties in the cell is 
a very interesting question that is open to speculation. 
In Methanosarcina species, it is proposed that 20 to 
35% of genes were acquired horizontally from bac- 
teria, including group I chaperonin genes. The coex- 
istence of both groups of chaperonins in the same cy- 
tosol in the Methanosarcina species provides a useful 
model system for studying the differential substrate 
specificities of the group I and II chaperonins, and for 
elucidating how newly synthesized proteins are sorted 
from the ribosome to the appropriate chaperonin for 
folding (45, 62). 


PERSPECTIVE: THE NEXT FIVE YEARS 


Molecular chaperones are diverse and eclectic, 
and every cell carries a unique repertoire of different 
independent or cooperative protein-folding machines. 
In the Archaea, chaperones that are stress inducible 
have received the most attention thus far. Perhaps the 
most understandable cellular functions of chaperones 
occur during cell stress by salvaging nonnative pro- 
teins and recruiting them to join the pool of stable 
proteins, to prevent their demise as intracellular ag- 
gregates. In the eucarya and bacteria, chaperones are 
also known to participate in many fundamental cel- 
lular processes in nonstressed cells including DNA 
replication, regulation of gene expression, cell divi- 
sion, membrane translocation, protein folding, and 
protein remodeling (100). For example, the ClpA 
proteins in bacteria can mediate protein folding, un- 
folding, assembly, and disassembly without them- 
selves being part of the final complex (74). Open 
questions remain regarding the archaeal protein-folding 
systems as to how posttranslational modeling 
functions are partitioned between the known 
chaperones and chaperones or cochaperones that 
have not yet been discovered. 

In the case of chaperonin and prefoldin, the com- 
pact nature of the protein-folding pathways in normal 
and stressed cells has been accomplished by double- 
duty assignments of these chaperones to normal, as 
well as stress-responsive protein-folding pathways (45). 
The mechanisms of protein folding in more complex 
eucaryal pathways have become accessible through the 
analysis of chaperones from archaea, due to their sim- 
ple architecture and exceptional stability. New insights 
into posttranslational processing and protein salvage 
will very likely emerge in the next five years as new de- 
velopments, such as tractable genetic systems (see 
Chapter 21), are more widely applied in the archaea. 

One aspect of protein trafficking in archaea that 
has received little attention is the decisive mechanism 
that sends irreversibly misfolded proteins to their pro- 
teolytic fates. The proteasome has been characterized 
with respect to the PAN (proteasome-activating nu- 
cleotidase) system. PAN serves as a sensor for un- 
folded proteins. Binding activates ATP hydrolysis, 
which stimulates substrate unfolding, gate opening 
in the 20S complex, and protein translocation (3). 
Pan A and Pan B proteins prime entry into the 20S 
complex and the PAN regulatory complex, a ho- 
molog of the eucaryal 19S ATPase (77). However, the 
mechanisms of PAN action in archaea are still the 
subject of active research. The complex function of 
protein entry into the proteasome is controlled in eu- 
carya by ubiquitinylation, a prerequisite for proteoly- 
sis by the proteasome. While all archaea have puta- 
tive ubiquitin-binding domains in the NAC proteins 
(89), the factors controlling the gating of proteins 
into the proteasome and the restraints exerted on na- 
tive proteins, preventing their entry into the protein 
turnover process, have not been determined. This 
enigma in archaeal molecular biology is very likely 
to be addressed and solved within the next five years. 

Chapter 11 


Sensing, Signal Transduction, and Posttranslational Modification 


PETER J. KENNELLY 


INTRODUCTION 


The ability to coordinate molecular functions 
and modulate them in response to changes in the sta- 
tus of both its internal milieu and external environ- 
ment is essential to the continuation of any and all life 
forms (Fig. 1). Without recourse to regulatory con- 
trols, adaptations, and checkpoints an organism will 
succumb, sooner or later, to the exhaustion of the raw 
materials needed to sustain existence or an ill-timed 
leap into a resource-intensive and demandingly complex 
process such as cell division. Hence, all contemporary 
life forms, no matter how “simple,” “primitive,” 
or “ancient” they may appear, must contain a 
basic suite of sensor-response machinery (Table 1). 

The extremophilic organisms that dominate the 
domain Archaea have challenged long-established 
concepts regarding the parameters that define a habitable 
environment. How do the Archaea sense and 
respond to the extremes of acidity, temperature, salinity, 
pressure, etc. that characterize their distinctive environmental 
niches? Have the Archaea developed 
unique mechanisms for regulating the many novel 
metabolic pathways to which they are host? Do archaea 
communicate with one another or with other 
organisms? 

At this point in time, we know comparatively little 
regarding molecular regulatory mechanisms in the 
Archaea. The mechanisms that control archaeal gene 
transcription and chemotaxis are described in Chapters 
6 and 18, respectively. The present chapter focuses 
on other modalities of signal transduction, including 
intracellular second messengers, feedback 
regulation, and posttranslational modifications such 
as the phosphorylation-dephosphorylation of proteins 
(Fig. 2). 


RECEPTORS 


How “Sensitive” Are the Archaea? 


Do the Archaea possess the extensive sensor- 
response machinery typical of free-living Bacteria 
and Eucarya? Given the unique nature of the habi- 
tats occupied by the first recognized members of the 
third domain of life, one’s first instinct might be to 
say no. Wouldn’t the hostility of these extreme envi- 
ronments toward conventional life forms suggest 
that their archaeal inhabitants were effectively quar- 
antined from the rest of the biosphere? Moreover, the 
unequivocal nature of the descriptor “extreme” im- 
plies an inherent constancy. When viewed from our 
own terrestrial frame of reference, these environ- 
ments in general appear to be invariantly extreme in 
their acidity, salinity, temperature, etc. The sensor- 
response needs of an organism living in such physi- 
cally monotonous and biologically sterile niches there- 
fore would be expected to be minimal in extent and 
primitive in nature. 

However, the Archaea are not ecological hermits. 
With the development of sensitive molecular probes 
such as the polymerase chain reaction, it has become 
apparent that archaea permeate the biosphere. Con- 
versely, we now realize that a phylogenetically diverse 
range of organisms populate the extreme habitats for- 
merly regarded as homogenous archaeal ghettoes. It 
is therefore not surprising to find that the Archaea re- 
spond to a wide range of environmental factors, in- 
cluding the type and availability of sources of carbon, 
energy, and essential elements, as well as variations 
in the pH, salinity, and temperature of their sur- 
roundings (Table 2). However, while it is clear that 
archaea are “sensitive,” many fundamental questions 
remain. What types of sensor-response mechanisms 
have these phylogenetically distinct, and oftentimes 
metabolically unique, organisms developed to cope 
with life at terrestrial extremes? What can the Archaea 
tell us about the origins and evolution of molecular 
sensor-response pathways across the phylogenetic 
spectrum? 


Figure 1. Environmental variables and internal cues known or 
likely to be monitored by members of the Archaea. 


Do Archaea Communicate with Their Neighbors? 


Little is known regarding cell-cell communica- 
tion systems in the Archaea. Many archaea populate 
syntrophic microbial consortia (215, 221), implying 
the existence of both intraspecies and interspecies sig- 
naling mechanisms. Empirical studies of archaeal-ar- 
chaeal and archaeal-bacterial communication have 
been few in number and preliminary in nature (131, 
193, 222). Inspection of archaeal genomes has re- 
vealed them to be devoid of homologs of the proto- 
typic bacterial quorum-sensing proteins LuxS and 
LuxR (286). 


Archaeal Receptors 


While it is clear that the Archaea are responsive to 
changes in their surroundings, our knowledge of the 
specific environmental cues that are monitored and the 
receptors that recognize them remains limited. Most of 
the archaeal receptors characterized to date fall into 
five broad functional classes: sensors for amino acids, 
sensors for light energy, sensors for diatomic gases, 
sensors for osmotic stress, and sensors of electrical po- 
tential. In addition, bioinformatic analyses have identi- 
fied several proteins whose topography suggests that 
they function as transmembrane receptors. 


Amino acids and betaines 


The haloarchaeon, Halobacterium salinarium, 
employs three different types of receptors to detect 
amino acids and derivatives thereof. HtrII is a classic 
transmembrane receptor consisting of an extracellular 
serine-binding domain fused to an intracellular trans- 
mitter domain, specifically a methyl-accepting chemo- 
taxis protein (MCP) (117). By contrast, BasB, a re- 
ceptor for branched-chain and sulfur-containing 
amino acids, and CosB, which binds the modified 
amino acid betaine, also known as N,N,N-trimethyl- 
glycine, lack either membrane domains or fused sig- 
naling domains. It has been postulated that these pro- 
teins are tethered to the cell surface by lipid anchors, 
where they interact with the extracellular domains of 
their cognate MCPs, BasT and CosT (154). The halo- 
archaeal arginine receptor Car, on the other hand, is a 
soluble MCP-fusion protein that resides within the 
cytoplasm. Car relies on the activity of the arginine:or- 
nithine antiporter to deliver ligands to its vicinity (285). 
Each of these receptors triggers a two-component sig- 
naling cascade that regulates the activity of flagellar 
motor proteins (see “Two-component system” below, 
and Chapter 18). Examination of archaeal genome 
sequences has revealed that several additional archaea 
contain deduced MCP-associated chemoreceptors, in- 
cluding Archaeoglobus fulgidus, Haloarcula maris- 
mortui, Methanococcus maripaludis, Methanosarcina 
acetivorans, Methanosarcina mazei, Pyrococcus abyssi, 
and Pyrococcus horikoshii (95, 289). 
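
The receptor architectures described above can be summarized as a small data structure. The following is a minimal sketch, assuming a hypothetical `Chemoreceptor` record whose field names are invented here for illustration; the entries restate only what the text says about HtrII, BasB, CosB, and Car.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chemoreceptor:
    """Illustrative record; the field names are assumptions, not from the text."""
    name: str
    ligand: str
    location: str                      # where the ligand-binding unit resides
    mcp_fused: bool                    # MCP domain on the same polypeptide?
    partner_mcp: Optional[str] = None  # separate cognate MCP, if any


# Entries paraphrase the H. salinarium receptors described in the text.
RECEPTORS = [
    Chemoreceptor("HtrII", "serine", "transmembrane", True),
    Chemoreceptor("BasB", "branched-chain and sulfur-containing amino acids",
                  "cell surface (postulated lipid anchor)", False, "BasT"),
    Chemoreceptor("CosB", "betaine",
                  "cell surface (postulated lipid anchor)", False, "CosT"),
    Chemoreceptor("Car", "arginine", "cytoplasm", True),
]


def needs_partner(receptor: Chemoreceptor) -> bool:
    """True when signaling depends on a separate MCP polypeptide."""
    return not receptor.mcp_fused


if __name__ == "__main__":
    for r in RECEPTORS:
        relay = r.partner_mcp if needs_partner(r) else "fused MCP domain"
        print(f"{r.name}: {r.ligand} -> {relay}")
```

Separating the binding protein from its MCP, as in the BasB and CosB entries, is what distinguishes these receptors from the single-polypeptide designs of HtrII and Car.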


Light 


Many haloarchaea contain a brace of sensory 
rhodopsins, designated SRI and SRII, each of which is 
configured to sense a specific color of light via the 
photo-induced isomerization of retinal (115) (see 
Chapter 14). SRI is activated by orange light, while 
SRII is sensitive to blue-green wavelengths. Both sen- 
sory rhodopsins are coupled to two-component sig- 
naling cascades via their cognate MCPs, HtrI and 
HtrII, respectively. Intriguingly, HtrII also serves as 
the chemoreceptor for the amino acid serine (see 
“Amino acids and betaines,” above). 


Oxygen and other gases 


The Archaea employ a variety of heme proteins 
as receptors for oxygen and, potentially, other di- 
atomic gases such as NO and CO. The aerotactic be- 
havior of H. salinarium, for example, is mediated by 
two heme-containing proteins designated HtrVIII and 
HemAT-Hs. HtrVIII is an integral membrane protein 
in which a heme-containing cytochrome oxidase-like 


Table 1. Definitions of terms and abbreviations used in Chapter 11 


Allosteric regulation. The modulation of the functional properties of a protein via the binding of a molecule (allosteric effector) to a site 
distinct from that at which the function takes place. The association of an allosteric effector with its protein target is noncovalent and, 
hence, reversible. As the gross concentration of an allosteric effector decreases, it will dissociate from its cognate target protein. 


ATCase. Aspartate transcarbamoylase. 

CACHE domain. A deduced extracellular domain conserved among certain calcium channels and chemotaxis receptors. 
cAMP. The second messenger 3’,5’-cyclic AMP. 

c-diGMP. The second messenger bis(3’,5’-cyclic diguanylic acid). 

cGMP. The second messenger 3’,5’-cyclic GMP. 


CHASE domain. A general designation for a series of deduced cyclase/histidine kinase-associated sensing extracellular domains. 
Numbers are used to designate the various classes of CHASE domains, i.e., CHASE1, CHASE2, etc. 


Covalent modification. Any structural change involving the formation and/or rupture of a covalent bond. Examples include 
phosphorylation, proteolysis, glycosylation, and disulfide formation. 


Downstream and upstream. Terms used to define the relative positions of the molecular components within a signal transduction 
cascade. Downstream refers to a frame of reference proceeding from the signal to the target, and upstream refers to the converse. 


EF-2. Elongation factor 2. 


ePK. The “eukaryote-like” protein kinases. The most prolific family of protein-serine/threonine/tyrosine kinases in nature. The 
designator derives from the fact that the members of this family were thought, for many years, to be confined exclusively to members of 
the Eucarya (“eukaryotes”). 


Feed-forward regulation. Feed-forward activation is that form of feedback regulation in which an increase in the level of the indicator 
metabolite leads to the stimulation of the activity of a target enzyme. 


Feedback inhibition. Feedback inhibition is that form of feedback regulation in which an increase in the level of the indicator 
metabolite triggers a decrease in the activity of a target enzyme. 


Feedback regulation. Regulation of an enzyme catalyzing one of the early committed steps in a pathway or process in which an end 
product of the pathway, or some closely related pathway, serves as an indicator metabolite. 


GAF domain. A cyclic nucleotide-binding domain. The acronym GAF is derived from the names of some of the proteins in which it 
appears: cGMP-stimulated phosphodiesterase, Anabaena adenylate cyclase, and bacterial transcription factor FhlA. 


GAPN. Nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase. 
GDH. Glutamate dehydrogenase. 


Hierarchical regulation. Numerous proteins are targeted by multiple transmission domains, covalent modifications, second messengers, 
and/or indicator metabolites. In those cases where these regulatory modalities act in something other than a simple, additive manner, the 
overall regulatory mechanism is termed hierarchical. 


Histidine kinase (domain). A protein kinase that autophosphorylates on a conserved histidine residue. 
HPr. Histidine-rich protein. 


Hpt domain. Histidine phosphotransfer domain. A signal transmission unit sometimes encountered in extended two-component signal 
transduction cascades. Hpt domains are phosphorylated on a conserved histidine residue. The predominant function of Hpt domains is 
to shuttle phosphoryl groups between response regulator domains. 


Hybrid histidine kinase. A polypeptide that contains, in addition to a histidine kinase domain, one or more response regulator and/or 
Hpt domains. 


IF-5A. Initiation factor 5A. 
Indicator metabolite. A metabolite that also serves as an allosteric effector. 


KTN domain. KTN stands for K+ transport, nucleotide-binding domain. KTN domains are found within the potassium transport 
proteins of several bacteria, where they presumably bind nucleotides that serve as allosteric regulators of channel activity. 


MCP. Methyl-accepting chemotaxis protein. A conserved family of signal transmission domains that generally are fused to a 
chemotactic or other type of sensor domain. The name derives from the fact that these proteins/domains are subject to covalent 
modification by methylation of glutamic acid side chains. Methylation serves to modulate the sensitivity of the sensor. MCPs generally act 
on an associated or fused histidine kinase. 


ORF. Open reading frame. 




PAS domain. PAS is an acronym formed from the names of three members of the protein family, Per, the period clock protein from 
Drosophila melanogaster; Arnt, an aryl hydrocarbon receptor nuclear translocator from vertebrates; and Sim, “single-minded” protein from 
Drosophila. PAS domains provide a scaffold on which a variety of different prosthetic groups can be mounted to serve as sensors for light, gases, etc. 


PPM. One of the two most abundant families of protein-serine/threonine phosphatases. The membership of the PPM family includes 
protein phosphatase-2C and SpoIIE. 


PPP. One of the two most abundant families of protein-serine/threonine phosphatases. The membership of the PPP family includes 
protein phosphatase-1, protein phosphatase-2A, and protein phosphatase-2B (calcineurin). 


Protein kinase (PK). Any enzyme that catalyzes the transfer of a phosphoryl group from a donor substrate such as ATP to an amino 
acid side chain of a protein. The amino acid side chains most frequently targeted by this covalent modification include the hydroxyl 
moieties of serine, threonine, and tyrosine; the carboxylate of aspartic acid; and the nitrogen atoms within the imidazole ring of histidine. 
If a PK catalyzes its own modification by phosphorylation, it is said to be autophosphorylated. Autophosphorylation may occur in cis or 
trans. Five major superfamilies of protein kinases have been identified to date, the “eukaryote-like” protein kinases, histidine kinases, 
myosin heavy chain/eIF-2 or α-kinases, the isocitrate dehydrogenase kinase/phosphatases, and the HPr kinases. 


Protein phosphatase. Any enzyme that catalyzes the hydrolysis of the covalent bond linking a phosphoryl group to an amino acid side 
chain on a protein. 


PTP. A general term for protein-tyrosine phosphatases. Specific families of PTPs include the conventional PTPs (cPTPs) and low- 
molecular-weight PTPs (LMW PTPs). 


Receptor. Any protein containing a functionally competent sensor domain. 


Redox regulation. Redox stands for reduction-oxidation. Redox regulation refers to a change in functional status effected by the 
addition or removal of one or more electrons. This may (e.g., disulfide formation) or may not (e.g., transition of ferrous to ferric iron) be 
accompanied by the formation and/or rupture of a covalent bond. 


Response regulator domain. A domain that autophosphorylates on a conserved aspartic acid residue using an autophosphorylated 
histidine kinase as substrate. Response regulator domains are commonly fused to or associated with transcription factors or modulators 
of flagellar motor proteins. 


Second messenger. Second messengers, such as cAMP or Ca2+, are dedicated allosteric effectors that are introduced into the interior of 
a cell in response to an extracellular signal (the first messenger). The allosteric effector serves as a surrogate, or second, messenger for the 
external signal. 


Sensor. The molecular species with which the signal directly interacts. Common sensors include ligand-binding sites, individual amino 
acid side chains, and prosthetic groups such as heme. 


Sensor protein/domain. The minimum macromolecular unit that can serve as a functionally competent sensor. 


Sensor-response. The selective alteration of one or more cellular processes as a consequence of the appearance, disappearance, or shift 
of a specific internal or external signal. 


Sensor-response pathway, sometimes referred to as a signal transduction cascade. The molecules responsible for effecting selective 
alterations in cellular processes in response to changes in a specific internal or external signal: 


Signal → signal-response machinery → target → cellular effect. 


Signal. Any species that acts via a sensor-response pathway to effect a specific cellular response. Signals may be biological (e.g., other 
cells and organisms), chemical (e.g., toxins, nutrients, metabolites, pH), or physical (e.g., surfaces, light, temperature, pressure) in nature. 


Signal transduction. The process of sensing and responding to a signal that originates from a source external to the cell, such as the 
surrounding environment or other cells. This subset of general sensor-response events is defined by the requirement to relay or 
“transduce” the signal, either directly or indirectly, across the barrier of the cell membrane. 


Signal transmission protein/domain (also referred to as a transmitter or signaling domain). A protein or domain thereof whose function 
is to regulate another protein. Transmitters may act by producing a second messenger, by binding to or dissociating from another protein, 
or by catalyzing the covalent modification of another protein. It is not uncommon for multiple transmission proteins to be linked 
together to form a signal transduction cascade. 


Target. A protein(s) (e.g., transcription factors, enzymes, cytoskeletal components) whose functional properties must ultimately be 
altered (regulated) to effect the desired cellular response. 


Transmembrane receptor. A receptor that spans the cell membrane. In general, transmembrane receptors are configured with their 
sensor domain facing the exterior of the cell with their transmitter domain or associated transmitter protein facing the interior. 


Two-component system (sometimes referred to as the His-Asp phosphorelay system). A signal transmission unit whose basic core 
consists of a histidine kinase domain and a response regulator domain. Two-component systems are found in most bacteria, as well as 
some archaea and eucarya. In the Archaea, the two-component paradigm is employed almost exclusively for the control of chemo- and 
phototaxis (see Chapter 18) and gene transcription (see Chapter 6). 
Figure 2. Basic elements of biological sensor-response pathways. (A) Hypothetical multistep signal transduction pathway in 
which an external signal (open diamond) interacts with a transmembrane receptor complex to activate a target protein. The 
sensor-response pathway comprises two steps. In the first, the transmission domain of the receptor complex produces a sec- 
ond messenger (filled circle) that, in turn, serves as an allosteric activator for a second transmission domain (hatched circle) that 
catalyzes the covalent modification (open triangle) of the target (open quadrilateral). In this example, binding of the allosteric 
ligand and covalent modification both activate their respective target proteins by altering their conformation. (B) Hypotheti- 
cal multistep biosynthetic pathway that is subject to feedback inhibition by one of the products of the final enzyme in the 
pathway (filled diamond). In this case, the indicator metabolite binds to and allosterically activates a sensor-transmitter 
fusion protein that subsequently binds to and inhibits the activity of the first enzyme in the pathway (target). The second and third 
enzymes in the pathway are denoted by diagonal hatching and cross hatching, respectively. See Table 1 for definitions of 
terms used. 


receptor domain is fused to an MCP-family signal 
transmitter domain (42). HemAT-Hs is a soluble pro- 
tein in which a myoglobin-like domain is fused to an 
MCP (118). In addition to HemAT-Hs, simple globins 
lacking fusion domains have been characterized from 
Aeropyrum pernix and M. acetivorans (89). These ar- 
chaeal “hemoglobins” bind NO and CO as well as 
oxygen in vitro. It has been suggested that redox-me- 
diated interconversion of the heme iron between the 
Fe(II) and Fe(III) state (the latter of which does not 


bind gases) may occur in these heme proteins, thereby 
enabling them to sense changes in both oxygen levels 
and redox state (43). 

The transcription factor Bat from Halobacterium 
sp. NRC-1 (106) is an example of the third, and po- 
tentially largest and most diverse type of heme-based 
oxygen sensor in the Archaea. Bat is a soluble protein 
containing a PAS domain, a putative cGMP-binding 
domain, and a helix-turn-helix domain (22). The 
acronym PAS is derived from Per, the period clock 
Table 2. Responses of the Archaea to environmental cues 

Archaeon | Environmental cue | Response monitored | References 
Archaeoglobus fulgidus | pH, temperature, metals, oxygen, xenobiotics | Biofilm production | 166 
Archaeoglobus fulgidus | Salinity, temperature | Osmolyte levels | 204 
Archaeoglobus fulgidus | Heat shock | Gene transcription | 244 
Ferroplasma acidarmanus | Copper | Gene transcription and protein expression | 21 
Haloarcula marismortui | Glucose and acetate | Enzyme activity | 39 
Halobacterium NRC-1 | Light | Transcriptome and proteome | 22 
Halobacterium halobium | Glucose, acetate, benzoate, NiSO4, histidine, asparagine, leucine, methionine, quinine, phenol | Chemotaxis | 261, 278 
Halobacterium halobium | Light | Phototaxis | 112, 278 
Halobacterium halobium | Light | Protein phosphorylation | 277, 279 
Halobacterium salinarium | Betaine, choline, carnitine | Chemotaxis | 154 
Halobacterium salinarium | Diesel oil | Proteomics | 174 
Haloferax mediterranei | Growth phase, salinity | mRNA stability | 124 
Haloferax mediterranei | Growth phase, salinity, light, oxygen | Gene transcription | 230 
Haloferax volcanii | Growth phase | Gene transcription | 239 
Haloferax volcanii | Heat shock | Gene transcription | 159, 295, 296 
Haloferax volcanii | Salinity | Gene transcription | 28 
Haloferax volcanii | Salinity | Proteome | 210 
Haloferax volcanii | Amino acids | Transporter activity | 133 
Halomonas elongata | Salinity | Proteome | 210 
Halorubrum sp. | Osmotic shock | Membrane composition | 172 
Metallosphaera sedula | Heterotrophy vs. chemolithotrophy | Gene transcription | 134 
Metallosphaera sedula | Heat shock, nutrient levels | Proteome, respiration | 226 
Methanothermobacter thermautotrophicus | Amino acids | Enzyme activity | 97 
Methanothermobacter thermautotrophicus | Temperature | Osmolyte levels | 111 
Methanococcoides burtonii | Cold adaptation | Proteome | 99, 100 
Methanococcus igneus | Salinity, temperature | Osmolyte levels | 53 
Methanococcus jannaschii | Salinity | Gene transcription | 231 
Methanococcus jannaschii | Heat shock, cold shock | Gene transcription | 35 
Methanococcus maripaludis | Salinity | Gene transcription | 231 
Methanococcus maripaludis | Alanine | Gene transcription and enzyme activity | 182 
Methanococcus maripaludis | Ammonia | Enzyme activity | 140 
Methanococcus maripaludis | Ammonia | Gene transcription | 56 
Methanococcus maripaludis | Hydrogen | Gene transcription | 321 
Methanococcus thermolithotrophicus | Salinity | Osmolyte synthesis | 202 
Methanococcus voltae | Acetate, isoleucine, leucine | Chemotaxis | 271 
Methanohalophilus portucalensis | Salinity | Osmolyte levels | 243 
Methanosarcina acetivorans | Salinity | Gene transcription | 231 
Methanosarcina acetivorans | Methanol | Gene expression and proteome | 247 
Methanosarcina barkeri | Salinity | Gene transcription | 231 
Methanosarcina barkeri | Ammonia | Enzyme activity | 185 
Methanosarcina barkeri | Salinity | Osmolyte levels | 38 
Methanosarcina mazei | Salinity | Gene transcription | 231 
Methanosarcina mazei | Heat shock | Transcription factor level and/or activity | 63 
Methanosarcina mazei | Heat shock | Gene transcription | 165 
Methanosarcina mazei | Carbon and nitrogen sources | Gene transcription | 304 
Methanosarcina thermophila | Salinity | Osmolyte level | 275 
Methanothermus fervidus | Temperature | Osmolyte levels | 111 
Methanothermus sociabilis | Temperature | Osmolyte levels | 111 
Natronobacterium pharaonis | Light | Phototaxis | 259 
Pyrococcus strain ES4 | Heat shock | Proteome | 116 
Pyrococcus furiosus | Cold shock | Transcriptome | 313 
Pyrococcus furiosus | Heat shock | Transcriptome | 269 
Pyrococcus furiosus | Salinity, temperature | Osmolyte levels | 203 
Pyrococcus furiosus | Elemental sulfur | Enzyme activity | 1 
Pyrococcus furiosus | Carbon source | Gene transcription | 68 
Pyrococcus furiosus | Carbon source | Enzyme activity and gene transcription | 302 
Pyrococcus furiosus | Carbon source | Transcriptome | 263 
Pyrodictium occultum | Heat shock | Protein level | 232 
Sulfolobus acidocaldarius | UV light | Genetic reversion and recombination frequency | 320 
Sulfolobus shibatae | Heat shock | Protein level | 298 
Sulfolobus solfataricus | Carbon source | Gene transcription | 86, 109, 123 
Sulfolobus solfataricus | UV irradiation and actinomycin D | Gene transcription | 257 
Sulfolobus solfataricus | Lysine | Gene transcription | 40 
Thermococcus strain ES-1 | Elemental sulfur | Enzyme activity | 194 
Thermococcus barophilus | Pressure | Proteome | 201 
Thermococcus litoralis | Carbon source | Gene transcription | 68 
Thermoproteus tenax | Autotrophy vs. heterotrophy | Gene transcription and enzyme activity | 45 


protein from Drosophila melanogaster; Arnt, an aryl 
hydrocarbon receptor nuclear translocator from 
vertebrates; and Sim, “single-minded” protein from 
Drosophila (293). PAS domains provide a scaffold 
on which several different prosthetic groups can be 
mounted, including heme, FAD, or the chromophore 
4-hydroxycinnamic acid. The nature of the prosthetic 
group determines whether a particular PAS domain 
senses oxygen, redox state, or light. PAS domains are 
found in a variety of archaea, including A. fulgidus, 
Haloarcula marismortui, Halobacterium sp. NRC-1, 
H. salinarium, Methanococcoides burtonii, Methano- 
spirillum hungatei, M. thermautotrophicus, M. ace- 
tivorans, Methanosarcina barkeri, M. mazei, Natro- 
nomonas pharaonis, and P. abyssi, where they are 
oftentimes fused with either an MCP domain or a 
two-component histidine kinase (9, 19, 94, 293). 
However, empirical evidence verifying the functional 
properties of these receptorlike proteins is lacking. 


Osmolarity 


Archaea such as Haloferax volcanii (172), M. jan- 
naschii (147), Thermoplasma acidophilum (148), and 
Thermoplasma volcanium (148) sense and adapt to 
osmotic stress through the action of mechanosensitive 
ion channels. Rather than sampling solute concentra- 
tion directly, these channels respond to stress-induced 
changes in the physical properties of the cell mem- 
brane. These channels are widely dispersed through- 
out the phylogenetic spectrum, implying early evolu- 
tionary origins (149). 


Temperature 


The farnesyl diphosphate/geranylgeranyl 
diphosphate synthase, Tk-IsdA, from the extreme thermophile 
Thermococcus kodakaraensis catalyzes the 
condensation of C5 isoprenoid units with allylic diphosphate 
to produce C15- and C20-diphosphates. These 
polyisoprenoids serve as precursors for the synthesis of 
squalene and its derivatives or membrane di- and 
tetraether lipids, respectively. Product composition is 
controlled by temperature, with increased 
temperature favoring synthesis of C15 over C20 products (90). 
The mechanism of regulation involves a temperature- 
mediated conformational change that alters the posi- 
tion of a gating tyrosine residue, Tyr-81. 


Electrical potential 


The activity of the flagellar motor proteins that 
regulate the swimming behavior of the haloarchaeon 
H. salinarium is sensitive to changes in proton motive 
force across the cell membrane. Gene knockout stud- 
ies have implicated a membrane-bound MCP, MpcT, 
as the relevant sensor (152). As MpcT is a small 
protein lacking a recognizable extracellular domain, it is 
proposed that MpcT responds directly to changes in 
membrane potential, rather than sensing shifts in the 
level of some chemical species such as H+. 

In general, the translocation of potassium across 
the membranes of archaeal cells is mediated by ho- 
mologs of the channels responsible for “gating” ion 
currents during neurotransmission in eucarya. The 
archaeal proteins KvAP from A. pernix (253, 254) 
and MVP from M. jannaschii (265) display all the 
features characteristic of the voltage-gated ion chan- 
nels of the Eucarya. Recombinant versions of the 
channels are selective for potassium and responsive to 
changes in membrane polarization. In addition, both 
KvAP and MVP can be inactivated by the spider- and 
scorpion-derived toxins that target their eucaryal 
counterparts. It has been postulated that the 
voltage-gated ion channels of the Archaea play a role in 
environmental adaptation and metabolic changes (158). 


Cryptic receptors 


A variety of putative receptors have been identi- 
fied based on circumstantial evidence in the form of 
(i) potential transmembrane topography; (ii) fusion to 
a likely signal-propagating domain such as an MCP, 
histidine kinase, protein-serine/threonine/tyrosine ki- 
nase, or protease; and (iii) conservation across a large 
number of organisms or association with multiple 
classes of signaling domains (96, 300). However, 
the ligands for these putative receptors have yet to be 
identified. A handful of these domains have been en- 
countered in archaeal genomes, including CACHE, 
CHASE4, and CHASE6. 
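
The three circumstantial criteria listed above amount to a simple screening rule. The sketch below shows one way such a rule might be written, assuming a hypothetical `CandidateDomain` record, an arbitrary `min_organisms` cutoff, and invented example families; none of these names, thresholds, or records come from the cited studies.

```python
from dataclasses import dataclass

# Signal-propagating domain classes listed under criterion (ii).
SIGNALING_DOMAINS = frozenset(
    {"MCP", "histidine kinase", "Ser/Thr/Tyr kinase", "protease"}
)


@dataclass(frozen=True)
class CandidateDomain:
    """Hypothetical summary of a conserved domain family."""
    name: str
    predicted_transmembrane: bool   # criterion (i)
    fusion_partners: frozenset      # domain types found fused to the family
    organism_count: int             # organisms encoding the family


def is_putative_receptor(d: CandidateDomain, min_organisms: int = 10) -> bool:
    """Require (i) and (ii), plus either half of criterion (iii)."""
    signaling = d.fusion_partners & SIGNALING_DOMAINS
    widely_conserved = d.organism_count >= min_organisms
    multiple_classes = len(signaling) >= 2
    return (d.predicted_transmembrane
            and len(signaling) >= 1
            and (widely_conserved or multiple_classes))


# Invented records for illustration only.
EXAMPLES = [
    CandidateDomain("family_A", True, frozenset({"MCP", "histidine kinase"}), 4),
    CandidateDomain("family_B", True, frozenset({"MCP"}), 3),
    CandidateDomain("family_C", False, frozenset({"protease"}), 50),
]

if __name__ == "__main__":
    for d in EXAMPLES:
        print(d.name, is_putative_receptor(d))
```

A rule of this kind can only nominate candidates; as the text notes, the ligands for these putative receptors remain unidentified.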

CACHE refers to an extracellular domain con- 
served among certain calcium channels and chemo- 
taxis receptors (7). Based on its fusion with some bac- 
terial MCPs, it has been suggested that CACHE 
domains bind small ligands such as citrate or amino 
acids. Archaea encoding potential CACHE-contain- 
ing receptor proteins include Halobacterium sp. NRC- 
1, M. burtonii, M. acetivorans, M. mazei, M. hungatei, 
and P. horikoshii (7, 14, 19, 292). Members of the 
genus Methanosarcina encode polypeptides in which 
both CACHE and MCP domains are fused to a po- 
tential ligand-gated ion channel similar to the acetyl- 
choline receptors of the higher Eucarya (292). 

CHASE stands for cyclase/histidine kinase-asso- 
ciated sensing extracellular (8, 214). Six classes of 
CHASE domains have been deduced using computa- 
tional methods, two of which are found in the Archaea 
(CHASE4 and CHASE6). CHASE4 domains form the 
predicted extracellular domains of histidine kinases in 
A. fulgidus, M. acetivorans, M. barkeri, and M. mazei 
(19, 327). A CHASE6 domain is fused to a domain of 
unknown function in Halobacterium sp. NRC-1 (327). 

In addition, a variety of deduced transmembrane 
and lipid-anchored secreted proteins have been identi- 
fied whose sequence indicates that they possess ligand- 
binding potential. However, these proteins lack en- 
dogenous intracellular signal transmission domains 
(Table 1). Thus, they would require an associated MCP, 
adenylate cyclase, protein kinase, membrane channel, 
phosphodiesterase, or other partner to form a func- 
tionally competent signal transduction unit. Potential 
ligands include metals (31, 208), di- and oligonu- 
cleotides (3, 31), amino acids and dipeptides (31), phos- 
phate (31), and sugars and sugar phosphates (3, 31). 
Other extracellular proteins contain domains impli- 
cated in cell-cell adhesion events in the Eucarya such as 
β-helix domains, β-propeller domains, and polycystic 
kidney disease domains (130, 233). 


SECOND MESSENGERS 


The term “second messenger” was coined to de- 
scribe a set of specialized intracellular signaling mole- 
cules that are synthesized or released in response to 
the binding of ligands, or first messengers, to extra- 
cellular receptors. The best-known example of a sec- 
ond messenger is 3’,5’-cyclic AMP (cAMP), which is 
produced by the enzyme adenylate cyclase when the 
hormone epinephrine binds to the β-adrenergic 
receptor. Second messengers in general elicit cellular 
effects by acting as allosteric modulators of key 
proteins (Fig. 2). Oftentimes allosteric regulators bind to 
and modulate the activity of other signal transmission 
proteins, such as protein kinases or protein phos- 
phatases, which in turn act on “downstream” target 
proteins. The resulting multistep signal transduction 
cascade provides both a means for constructing 
branches for reaching multiple proteins and, perhaps 
more importantly, a source of connective nodes 
within cellular sensor-response networks that facili- 
tate the integration of multiple inputs. In other cases, 
a second messenger will bind to and directly modu- 
late the enzymes and other proteins that form the ul- 
timate targets of a sensor-response pathway. 


Cyclic Nucleotides 


Cyclic nucleotides such as cAMP, 3’,5’-cyclic 
GMP (cGMP), and bis(3’,5’-cyclic diguanylic) acid 
(c-diGMP) are universally employed as regulators of 
gene expression and metabolic activity in bacteria 
and eucarya. However, although the presence of 
cAMP in the Archaea was reported nearly twenty 
years ago (175), very little is known about the role 
of cyclic nucleotides in these organisms (30). A siz- 
able majority, 80%, of the Archaea encode potential 
class 2 adenylate cyclases within their genomes (95). 
However, stereotypical cyclic nucleotide phosphodi- 
esterases appear to be lacking (95). 

c-diGMP is a novel second messenger that was 
recently discovered in the Bacteria. Examination of 
archaeal genome sequences for potential diguanylate 
cyclases has yielded equivocal results. Pei and Grishin 
(227) reported that several proteins from the archaea 
A. fulgidus, M. jannaschii, M. thermautotrophicus, 
and P. horikoshii contain the GGDEF motif that is 
characteristic of bacterial diguanylate cyclases. How- 
ever, Galperin (95) argues that any perceived resem- 
blance is too faint to constitute a reliable predictor 
of physiologic function. 
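
A short motif scan illustrates why such evidence can be equivocal. The sketch below is a naive pattern match, not the method used in the cited analyses; the pattern `GG[DE]EF` (covering the GGDEF and GGEEF variants), the function name `find_ggdef`, and the toy sequences are assumptions made for illustration. A five-residue match of this kind can arise by chance, so a hit on its own says little about physiologic function.

```python
import re

# Naive signature for the GGDEF motif; GG[DE]EF also admits the GGEEF
# variant. This pattern is an illustrative assumption, not a validated
# sequence profile.
GGDEF_PATTERN = re.compile(r"GG[DE]EF")


def find_ggdef(sequence: str) -> list:
    """Return 0-based start positions of GG[DE]EF matches in a protein sequence."""
    return [m.start() for m in GGDEF_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequences invented for the example; they are not real proteins.
    toy_sequences = {
        "toy_1": "MKTAYGGDEFLLV",
        "toy_2": "mktayggeefaa",
        "toy_3": "MKGGDQFL",
    }
    for name, seq in toy_sequences.items():
        print(name, find_ggdef(seq))
```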

While few archaeal proteins display the classic 
cyclic nucleotide-binding sites commonly found in the 
Eucarya and Bacteria (233), several possess a more re- 
cently recognized class of cyclic nucleotide-binding 
motif, the GAF domain (9). GAF is an acronym 
assembled from the names of the proteins in which it was 
first identified, i.e., the cGMP-stimulated phosphodi- 
esterases, Anabaena adenylate cyclases, and the bacte- 
rial transcription factor FhlA (16, 113). Since most ar- 
chaea contain only a single recognizable cyclase within 
their genomes, these archaeal GAFs are likely to bind 
cAMP instead of cGMP. Many of these GAF domains 
are fused to two-component histidine kinases (96), 
suggesting that archaea may utilize cAMP-mediated 
phosphorylation cascades. The triggers for these im- 
plied cascades remain to be elucidated, as none of the 
deduced archaeal adenylate cyclases identified to date 
contain a transmembrane domain (95), nor do they 
possess other recognizable functional domains beyond 
those implicated in catalysis (267). 


Calcium 


In the Eucarya, calcium has long vied with cAMP 
for the title of most prolific second messenger. While 
it has been speculated that calcium plays a prominent 
role as a second messenger in bacteria, little evidence 
has emerged to support this supposition (70, 209, 
272). Although a protein capable of activating a cal- 
modulin-dependent phosphodiesterase in a calcium- 
dependent manner was isolated from H. salinarium 
(246), examinations of archaeal genome sequences 
failed to reveal the presence of the calmodulin/parv- 
albumin family’s characteristic calcium-binding motif, 
the EF hand (233, 242). Nonetheless, many archaea 
possess the basic prerequisites for modulating the level 
of free calcium in their cytosol: channels for the ad- 
mittance of calcium into their cytoplasm (49) and the 
ATP-driven pumps required to maintain a low basal in- 
tracellular calcium concentration (65, 303). 

Several secreted hydrolases from the Archaea 
(98, 145, 236, 248, 258) bind to and are stabilized by 
calcium. However, it is difficult to envision a regula- 
tory or signaling function for calcium involving these 
extracellular enzymes. The handful of intracellular 
proteins that have been reported to reversibly bind 
calcium include an ATP-dependent DNA ligase from 
Sulfolobus shibatae (163), as well as several proteins 
from M. thermautotrophicus, namely elongation factor-1β,
MTH1880 (173), and a calcium-gated potassium channel
(129). Only the calcium-gated potassium channel
exhibited calcium-dependent effects on
its activity or structure that were characteristic of the 
binding of an allosteric ligand. 


Inositol Phosphates 


In the Eucarya, multiply phosphorylated forms 
of the hexose inositol serve as second messengers in 
signal transduction (121). While many thermophilic 
archaea are capable of synthesizing phosphorylated 
forms of inositol, these compounds are used as com- 
patible solutes to combat osmotic stress and appear 
to have no role in cellular regulation (216). 


ALLOSTERIC REGULATION BY 
METABOLITES 


Regulation via the binding of dissociable effec- 
tors is not confined to dedicated second messengers 
such as cAMP. A variety of metabolites also serve as 
allosteric regulators of key enzymes. In the most ba- 
sic form of allosteric regulation (feedback inhibi- 
tion), a biosynthetic end product inhibits the activity 
of an enzyme responsible for catalyzing an early 
committed step in its biosynthesis (224). Early in- 
tervention is more efficient as it avoids the accumu- 
lation of unneeded pathway intermediates. Other 
“indicator metabolites” are employed to coordinate 
flux through related pathways, such as those pro- 
ducing the purine and pyrimidine nucleotide build- 
ing blocks of DNA, or to intervene when changes in 
global cellular parameters, such as energy status or 
redox state, dictate that local control mechanisms be 
overridden. 


Acetolactate Synthase 


Acetolactate synthase catalyzes the first commit- 
ted steps in the synthesis of the branched-chain 
amino acids valine, leucine, and isoleucine: the con- 
densation of pyruvate with either a second molecule 
of pyruvate or 2-ketobutyrate to yield carbon diox- 
ide and either acetolactate or acetohydroxybutyrate, 
respectively. The acetolactate synthase activities 
in extracts from three species of Methanococcus, 
M. aeolicus, M. maripaludis, and M. voltae, are sen- 
sitive to feedback inhibition by branched-chain amino 
acids (323). However, the pattern of inhibition ob- 
served was species specific. The acetolactate synthase 
activity in extracts from M. aeolicus was sensitive 
solely to valine, while the activity in extracts from 
M. voltae could be inhibited by either valine or iso- 
leucine, suggesting that the enzyme in M. voltae was 
sensitive to both amino acids. While the acetolactate 
synthase activity in extracts of M. maripaludis also 
was sensitive to valine and isoleucine, the maximum 
degree of inhibition obtained with either amino acid 
was only about 50%. Such behavior is suggestive of 
the existence of either two isoforms of the enzyme, 
each of which was sensitive to a single feedback 
regulator, or possibly, additive inhibition of a single 
enzyme. 
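
The two-isoform reading can be made concrete with a small sketch. It assumes,
purely for illustration, that each isoform contributes half of the total
activity and is fully inhibited by its own amino acid at saturation; neither
assumption comes from reference 323.

```python
def residual_activity(isoform_shares, inhibitors_present):
    """Fraction of total activity left when saturating inhibitors are added.

    isoform_shares maps the amino acid that fully inhibits an isoform to
    that isoform's share of total activity (shares sum to 1).
    """
    return sum(
        share
        for inhibitor, share in isoform_shares.items()
        if inhibitor not in inhibitors_present
    )

# Hypothetical 50/50 split between a valine-sensitive and an
# isoleucine-sensitive isoform; the real proportions are unknown.
isoforms = {"valine": 0.5, "isoleucine": 0.5}
print(residual_activity(isoforms, {"valine"}))
print(residual_activity(isoforms, {"valine", "isoleucine"}))
```

Under these assumptions either amino acid alone removes about half of the
activity, which matches the roughly 50% ceiling described above.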


Aspartate Transcarbamoylase


The enzyme aspartate transcarbamoylase (AT- 
Case) catalyzes the initial step in de novo pyrimidine 
biosynthesis. The form of ATCase found in Escherichia 
coli is used, literally, as a textbook example for two 
important phenomena: cooperativity and allosteric 
regulation (184). The enterobacterial enzyme is a do- 
decamer comprising equal numbers of catalytic and 
regulatory subunits. Binding of the substrate, as- 
partate, results in homotypic positive cooperativity. 
E. coli ATCase is subject to feedback inhibition by 
the pathway end product CTP, whose efficacy can be 
potentiated by a second pathway end product, UTP. 
ATP activates the enzyme via a mechanism that 
largely overrides the inhibitory effects of CTP, thus 
ensuring that pyrimidine nucleotide biosynthesis 
keeps pace with purine nucleotide biosynthesis during 
periods of vigorous biosynthetic activity. 
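
Positive cooperativity of this kind is commonly summarized with the Hill
equation. The sketch below is a generic illustration, not a fitted model of
ATCase: the parameter values are arbitrary, and the function names are
introduced here only for the example. It shows how a Hill coefficient above 1
narrows the range of substrate concentrations over which the rate rises from
10% to 90% of its maximum.

```python
def hill_rate(s, vmax, k_half, n):
    """Hill equation: v = vmax * s**n / (k_half**n + s**n)."""
    return vmax * s**n / (k_half**n + s**n)

def s90_over_s10(n):
    """Ratio of substrate concentrations giving 90% and 10% of vmax."""
    # v/vmax = 0.9 requires s**n = 9 * K**n; v/vmax = 0.1 requires
    # s**n = K**n / 9, so the ratio of the two concentrations is 81**(1/n).
    return 81 ** (1 / n)

# Illustrative Hill coefficients only; not measured values for any ATCase.
for n in (1, 2):
    print(n, hill_rate(1.0, 1.0, 1.0, n), s90_over_s10(n))
```

A noncooperative enzyme (n = 1) needs an 81-fold change in substrate to span
that range, whereas n = 2 needs only a 9-fold change.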

Although the evolutionary history of the car- 
bamoyltransferases is a complex one, the archaeal 
ATCases appear to constitute a monophyletic group 
(160, 161). Nevertheless, those archaeal ATCases 
studied to date exhibit significant diversity in their ki- 
netic and regulatory properties. Those from Halo- 
bacterium cutirubrum (219) and P. abyssi (235, 301) 
displayed the type of cooperative kinetic behavior 
and pattern of allosteric regulation characteristic of 
the iconic ATCase from E. coli. On the other hand, 
the enzyme from M. jannaschii displays little evidence 
of either cooperativity or allosteric regulation by nu- 
cleotides (107). Intermediate between these extremes 
is the ATCase from S. acidocaldarius, which displays 
cooperative kinetics and activation by ATP, but is ac- 
tivated rather than inhibited by CTP and UTP (71). 


Biosynthesis of Aromatic Amino Acids 


The aromatic amino acids phenylalanine, tyrosine, 
and tryptophan are synthesized via a branched path- 
way whose last common step is the synthesis of choris- 
mate. In many bacteria, total flux through the pathway 
and distribution of that flux among its branches are reg- 
ulated by an elegant network of allosteric feedback 
events. While genomic and biochemical analyses indi- 
cate that flux through the early, common portion of 
the pathway responsible for the de novo synthesis of 
aromatic amino acids is not subject to allosteric feed- 
back inhibition in the Archaea (262), considerable 
evidence has appeared for allosteric control of end- 
product distribution. 

The first branch point in the biosynthetic path- 
way for the aromatic amino acids is formed by an- 
thranilate synthase and prephenate synthase, which 
direct flux from chorismate toward tryptophan and 
phenylalanine plus tyrosine, respectively. The de- 
duced amino acid sequence of the anthranilate syn- 
thase from H. volcanii reportedly contains the con- 
served residues involved in feedback inhibition of its 
bacterial homologs (164). Examination of anthranilate 
synthases isolated from A. fulgidus (47), M. therm- 
autotrophicus (97), S. solfataricus (299), and Ther- 
mococcus kodakaraensis (291) revealed that each is 
inhibited by tryptophan. 

The second branch of the pathway bifurcates im- 
mediately after the synthesis of prephenate from cho- 
rismate. Prephenate dehydrogenase catalyzes the first 
committed step in the synthesis of tyrosine, while 
prephenate dehydratase catalyzes the first committed 
step in the synthesis of phenylalanine from prephen- 
ate. Like its bacterial counterpart, the prephenate de- 
hydrogenase from the haloarchaeon Methanohalophilus
mahii is subject to feedback inhibition by its
cognate end product, tyrosine (87). Likewise, the 
prephenate dehydratase from this organism is subject 
to multivalent feedback control (87). Phenylalanine 
inhibits the enzyme in vitro, while amino acids such 
as tyrosine, methionine, leucine, and isoleucine acti- 
vate it. Similarly, the prephenate dehydratase from
Halobacterium vallismortis is inhibited by phenylala- 
nine and activated by isoleucine (127). While several 
other amino acids were also observed to exert al- 
losteric effects in vitro, the concentrations required 
were supraphysiological. Feedback regulation is ap- 
parently not ubiquitous among the Archaea. It has re- 
cently been reported that both the prephenate dehy- 
dratase and the prephenate dehydrogenase activities 
in M. maripaludis are insensitive to aromatic amino 
acids (234). 


dCTP Deaminase/dUTP Diphosphatase 


dCTP deaminase/dUTP diphosphatase represents 
a uniquely archaeal fusion, within a single protein 
domain, of the active-site components responsible for 
catalyzing two reactions in the synthesis of uridine 
and deoxyuridine nucleotides from cytosine and de- 
oxycytosine nucleotides (120). The deoxyuridine nu- 
cleotides produced by this enzyme also serve as the 
precursors to deoxythymine nucleotides, such as 
dTTP, which acts as a feedback inhibitor of this bi- 
functional enzyme (180). 


Flagellar Motors 


Switching the direction of rotation of flagellar 
motors in response to chemo-, aero-, and phototactic 
signals in H. salinarium requires the presence of the 
tricarboxylic acid (TCA) cycle intermediate fumarate 
(206). Fumarate serves as a switching or potentiating 
factor whose binding to flagellar motors renders the
proteins responsive to CheY, the terminal phospho- 
protein of the two-component signaling cascade (see 
“Two-component system,” below) that becomes acti- 
vated upon stimulation of the haloarchaeon’s chemo-, 
aero-, and photoreceptors (212). The switch itself is 
under the control of rhodopsin-containing photore- 
ceptors, whose activation triggers the release of a 
pool of membrane-bound fumarate molecules (211). 
While fumarate is not a second messenger in the strictest
sense of the term, its sequestration from central
metabolism (see Chapter 12) renders this membrane-
bound pool of fumarate the functional equivalent of a 
“classic” second messenger such as calcium. 


Glyceraldehyde 3-Phosphate Dehydrogenase 


The glycolytic enzymes of the Archaea display 
little evidence of the type of multinodal allosteric 
control so frequently observed in the Bacteria and 
Eucarya (305). A notable exception to this pattern is 
the unusual NAD+-dependent, nonphosphorylating
glyceraldehyde 3-phosphate dehydrogenase (GAPN)
from the archaeon Thermoproteus tenax. The cat-
alytic activity of this enzyme is sensitive to several
metabolites (44, 162). NADP(H), NADH, and ATP
inhibit GAPN by reducing its affinity for NAD+,
while AMP, ADP, glucose 1-phosphate, and fructose 
6-phosphate act in a reciprocal fashion. When the en- 
zyme was incubated with equivalent concentrations 
of the most potent inhibitor (NADPH) and activator 
(glucose 1-phosphate), a net twofold activation of the 
enzyme was observed despite the fact that the Kd of
the former was the lower of the two: 0.3 versus 1.0
µM (44). Thus, the binding of allosteric activators may
override the effects of allosteric inhibitors on
the catalytic efficiency of the enzyme.
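
Why this result is notable can be seen from a simple occupancy estimate. The
sketch below reads the two reported constants (0.3 and 1.0 µM) as
dissociation constants and assumes independent, single-site binding; both
points are assumptions made for illustration, as is the 1 µM test
concentration.

```python
def fractional_occupancy(ligand_um: float, kd_um: float) -> float:
    """Fraction of sites occupied for simple one-site binding."""
    return ligand_um / (kd_um + ligand_um)

# Constants quoted in the text (44); the 1 µM concentration is hypothetical.
KD_NADPH_UM = 0.3   # most potent inhibitor
KD_G1P_UM = 1.0     # most potent activator (glucose 1-phosphate)

ligand = 1.0
print(fractional_occupancy(ligand, KD_NADPH_UM))
print(fractional_occupancy(ligand, KD_G1P_UM))
```

At equal concentrations the inhibitor would occupy a larger fraction of its
sites than the activator, so the net activation observed cannot be explained
by occupancy alone.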


KtrA Potassium Transporter 


The genome of the methanoarchaeon M. jan- 
naschii encodes two proteins whose sequences sug- 
gest that they mediate the active transport of potas- 
sium ions. One of these, KtrA, contains a cytoplasmic 
KTN (K+ transport, nucleotide binding) domain that
includes a Rossmann nucleotide-binding fold (158),
and binds NAD+ or NADH in vitro (245). A comparison
of X-ray crystal structures indicates that
NADH and NAD+ act in a reciprocal manner on the
channel. NAD+ impedes opening of the channel by
binding in a manner that restricts movement of the 
hinge region of its protein subunits (245). However, 
when NADH is bound these hinge regions remain 
flexible, thereby permitting potassium transport to 
proceed (245). It therefore appears that importation 
of potassium is regulated by the energy status of 
the organism, as reflected by the ratio of the reduced 
versus oxidized form of the indicator metabolite 
NAD(H). 


Nitrogen Metabolism 


Glutamate dehydrogenase plays a central role in 
nitrogen metabolism in all living organisms (48). The 
haloarchaeon H. salinarium contains two forms of
the enzyme, one NADP+ dependent (NADP-GDH)
and another NAD+ dependent (NAD-GDH). A comparison
of the Km values of the two glutamate dehydrogenases
for ammonia indicates that NADP-GDH,
which exhibits a very low Km for ammonia, acts to
funnel nitrogen into anabolic pathways (34). On the 
other hand, NAD-GDH is thought to participate in 
amino acid catabolism (33). During catabolism, the 
oxidative deamination of glutamate yields NADH 
and 2-oxoglutarate; the latter is further oxidized via 
the Krebs cycle and ultimately yields ATP. NAD-GDH 
is activated by a variety of amino acids and inhibited 
by several Krebs cycle intermediates, including fu- 
marate, oxaloacetate, succinate, and malate (32). It 
has not been firmly established whether these in- 
hibitory dicarboxylates bind to the active site or to a 
distinct, allosteric regulatory site. However, each was 
observed to be kinetically noncompetitive with respect
to both glutamate and NAD+, consistent with
an allosteric mode of action. 

Several archaea are capable of fixing atmospheric 
nitrogen when other sources of this key element are 
lacking (176). In diazotrophic bacteria, 2-oxoglu- 
tarate (a central intermediate in nitrogen metabolism) 
serves as a critical indicator of intracellular nitrogen 
levels (218). Therefore, it is not surprising that in cer- 
tain archaea, 2-oxoglutarate stimulates the activity 
of key enzymes in nitrogen metabolism: nitrogenase 
from M. maripaludis (69) and glutamine synthetase 
from M. mazei (72). Both of these enzymes associate 
with the archaeal homolog of the PII protein found in 
photosynthetic plants and bacteria. In these latter or- 
ganisms, binding of 2-oxoglutarate to PII negates this 
protein’s ability to inhibit the activity of the nitrogen- 
metabolizing enzymes with which it associates. It ap- 
pears likely that archaeal PII proteins also mediate the 
action of this indicator metabolite. 


Phosphoenolpyruvate Carboxylase 


Phosphoenolpyruvate carboxylase (PEPC) cat- 
alyzes the synthesis of oxaloacetate, a key intermediate 
in the Krebs cycle. The PEPCs from S. acidocaldarius 
(256) and S. solfataricus (79) are inhibited by malate 
and the amino acid aspartate, a behavior exhibited by 
bacterial and eucaryal PEPCs. By contrast, PEPC from
Methanothermus sociabilis (255) is insensitive to these
compounds. 


Ribonucleotide Reductase 


Ribonucleotide reductase catalyzes the synthesis 
of deoxyribonucleotides from corresponding ribonu- 
cleotides. Thus far, only two archaeal ribonucleotide 
reductases, members of the class II family of coenzyme
B12-dependent enzymes, have been characterized
(75, 240). Feedback inhibition by dATP, an in-
dicator of intracellular dNTP pools, is commonly 
observed among the ribonucleotide reductases. It is 
therefore somewhat surprising that the ribonucleotide 
reductase from the archaeon T. acidophilum displays 
little sensitivity to dATP. However, the enzyme does 
exhibit a complex pattern of feedback regulation in 
which other nucleotides modulate both its catalytic 
efficiency and substrate specificity (75). For example, 
dTTP markedly stimulates activity toward GDP, 
while inhibiting activity toward ADP and CDP by 
about one-half. dGTP, on the other hand, stimulates 
activity toward ADP and inhibits activity toward 
CDP and GDP. dCTP stimulates activity toward GDP 
with little effect on activity toward either ADP or 
CDP. While stimulation of activity toward GDP by 
dCTP appears logical because the guanine and cyto- 
sine form one of the base pairs in double-stranded 
DNA, the effects of the other feedback regulators are 
less easily rationalized. The enzyme from P. furiosus 
behaves more conventionally, as it reportedly is 
“most strongly” inhibited by dATP (240). 
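
The effector pattern described for the T. acidophilum enzyme is easier to
compare when laid out as a lookup table. The sketch below simply encodes the
qualitative statements given above for reference 75; the symbols and the
helper function are conveniences introduced for this example.

```python
# "+" stimulates, "-" inhibits, "0" little effect, as summarized in the text.
EFFECTS = {
    "dTTP": {"GDP": "+", "ADP": "-", "CDP": "-"},
    "dGTP": {"ADP": "+", "CDP": "-", "GDP": "-"},
    "dCTP": {"GDP": "+", "ADP": "0", "CDP": "0"},
}

def effectors_with(substrate, sign):
    """List effectors that have the given effect on the given substrate."""
    return sorted(e for e, row in EFFECTS.items() if row.get(substrate) == sign)

print(effectors_with("GDP", "+"))
print(effectors_with("CDP", "-"))
```

Laid out this way, it is apparent that reduction of CDP is not stimulated by
any of the three effectors listed.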


Uracil Phosphoribosyltransferase 


Uracil phosphoribosyltransferase catalyzes con- 
densation of uracil with phosphoribosylpyrophosphate
(PRPP) to form UMP and pyrophosphate, a key step
in the pyrimidine salvage pathway. The uracil phos- 
phoribosyltransferase from the archaeon S. solfatari- 
cus is dramatically stimulated by GTP. When GTP is 
saturating, kcat increases nearly 20-fold while the Km
values for uracil and PRPP decrease 10- and 2-fold,
respectively (17, 126). UMP inhibits the enzyme in a
hierarchical, CTP-dependent manner. While feedback
regulation of the de novo biosynthesis of pyrimidines 
by pathway end products and feed-forward stimula- 
tion by purine nucleotides are common among bac- 
teria and archaea (see “Aspartate transcarbamoylase,” 
above), this behavior is atypical for pyrimidine sal- 
vage pathways. Its unique status renders the sophis- 
ticated allosteric regulatory mechanism of the uracil
phosphoribosyltransferase from S. solfataricus even 
more intriguing. 
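
The combined effect of GTP on catalytic efficiency (kcat/Km) follows directly
from the fold changes quoted above. The short calculation below uses the
rounded figures from the text (a 20-fold rise in kcat, and 10- and 2-fold
falls in Km), so the results are approximate.

```python
def efficiency_fold_change(kcat_fold_increase, km_fold_decrease):
    """Fold change in kcat/Km when kcat rises and Km falls by the given factors."""
    return kcat_fold_increase * km_fold_decrease

# Rounded values quoted in the text (17, 126).
print(efficiency_fold_change(20, 10))  # with respect to uracil
print(efficiency_fold_change(20, 2))   # with respect to PRPP
```

On these figures, saturating GTP raises kcat/Km roughly 200-fold for uracil
and roughly 40-fold for PRPP.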


Other Potential Binding Domains for 
Allosteric Ligands 


Examination of archaeal genomes has revealed 
the presence of several domains proposed to bind 
small molecule ligands (Table 3). In many cases, these
putative ligand-binding domains are fused to additional
domains that may function as targets for allosteric
regulation. These deduced functional domains include
membrane transporters, metabolic enzymes, DNA-binding
proteins, proteases, and esterases (9, 15). Several
examples also have been encountered in which individual
polypeptide chains contain two or more classes of small
molecule-binding domains, suggesting a potential role as
signal integrators.


Table 3. Intracellular small molecule-binding domains in the Archaea*

Domain name                                  Acronym   Hypothetical ligand(s)
Aspartokinase, chorismate mutase, TyrA       ACT       Amino acids, purines
ATP cone                                     N/A       ATP
Cystathionine β-synthase                     CBS       S-Adenosylmethionine
Double-stranded β-helix                      DSBH      Carbohydrates, oxalate, amino acids
Ferredoxins                                  Fx        Redox state
cGMP phosphodiesterase, adenylate cyclase,   GAF       Cyclic nucleotides, tetrapyrroles, formate
  FhlA
Heavy metal-associated                       HMA       Copper and other metals
Nitrogen fixation protein X                  NIFX      Molybdate, iron
Per-Arnt-Sim                                 PAS       Oxygen, light, redox state
Periplasmic binding protein type II          PBP-II    N-Acetylserine, thiosulfate, amino acids, sugars
Regulation of amino acid metabolism          RAM       Amino acids
Transcriptional regulators, cation-          TRASH     Heavy metals
  transporting ATPases, and hydrogenases
Transporter-associated OB domain             T-OB      Sulfate, molybdate, sugars
TrkA protein                                 TRKA      NAD(H)
Universal stress protein A                   USPA      NTPs
Vinyl 4 reductase                            4VHR      Hydrocarbons

*Deduced from analyses of genome sequences. For further information, see references 8, 15, 78, and 233.


PROTEIN-SERINE/THREONINE/TYROSINE 
PHOSPHORYLATION 


The covalent modification of proteins via phos- 
phorylation of the hydroxyl groups of serine, threo- 
nine, and tyrosine represents Nature’s most versatile 
and prolific mechanism for regulating cellular 
processes (322). The intrinsic ability of the phospho- 
ryl group to dramatically alter local charge and hy- 
drophilicity renders it a potent agent for perturbing 
the physicochemical characteristics of a protein on
both a local and a global scale. While regulation by 
allosteric ligands requires the addition of a large, 
three-dimensionally complex binding domain to a 
protein, the basic structural requirement for attaching 
a phosphoryl group consists of a suitably positioned 
amino acid side chain containing a nucleophilic hy- 
droxyl, carboxyl, amino, or imino group. Conse- 
quently, covalent phosphorylation is a regulatory 
mechanism compatible with and adaptable to an
extraordinarily diverse spectrum of proteins. Equally
important, a phosphorylated protein (phosphopro- 
tein) can be restored to its original, dephosphorylated 
state through the intervention of a protein phos- 
phatase (136). Evidence from both genomics and
biochemistry indicates that the covalent modification of
proteins via phosphorylation and dephosphorylation 
is a nearly universal attribute of the free-living mem- 
bers of the domain Archaea. 


Phosphoproteins 


Approximately a dozen proteins have been iden- 
tified in the Archaea that are subject to phosphoryla- 
tion on the hydroxyl groups of serine, threonine, and/ 
or tyrosine (Table 4). 


Protein-serine/threonine/tyrosine kinases 


Three protein kinases exhibiting the capacity to 
catalyze their self- or autophosphorylation, a com- 
mon behavior for enzymes of this type, have been 
identified in the Archaea: Rio1 (169) and Rio2 (167, 
168) from A. fulgidus and SsoPK2 from S. solfatari- 
cus (190). While autophosphorylation often consti- 
tutes a prerequisite for the efficient phosphorylation 
of exogenous substrate proteins, this would appear 
not to be the case for either Rio1 or SsoPK2. Versions 
of these proteins that had been rendered incapable of 
autophosphorylation by mutagenic alterations dis- 
played full activity toward exogenous proteins in 
vitro. Inspection of the X-ray structures of the phospho
and dephospho forms of Rio2 suggests that autophosphorylation
influences ATP binding by this protein kinase, but this
has not been tested experimentally (168). By contrast,
SsoPK3, a protein-serine kinase from the membrane fraction
of S. solfataricus, does not undergo autophosphorylation
in vitro (191). However, SsoPK3 is phosphorylated in situ
by a second, threonine-preferring protein kinase that
resides in the membrane of S. solfataricus (191).


Table 4. Archaeal phosphoproteins (a, b)

Protein                           Archaeon                Phosphoamino acid  Function                            Reference
Methyltransferase activation      M. barkeri              n.d.(c)            Autophosphorylated form activates   59
  protein                                                                      methyltransferase
90-kDa aminopeptidase             S. solfataricus         n.d.               ND                                  57
Cdc6                              M. thermautotrophicus   P-Ser              DNA-stimulated autophosphorylation  102
Cdc6                              S. acidocaldarius       n.d.               ND                                  64
Glycogen synthase                 S. acidocaldarius       Acid stable        ND                                  50
Phenylalanyl-tRNA synthetase      T. kodakaraensis        P-Tyr              ND                                  128
Phosphomannomutase                T. kodakaraensis        P-Tyr              ND                                  128
RNA terminal phosphate cyclase    T. kodakaraensis        P-Tyr              ND                                  128
SsoPK2                            S. solfataricus         P-Ser              Autophosphorylation                 190
SsoPK3                            S. solfataricus         P-Thr              ND                                  191
Initiation factor 2-α             P. horikoshii           Ser-48             ND                                  290
D-Gluconate dehydratase           S. solfataricus         n.d.               Activates                           145
Phosphohexomutase                 S. solfataricus         Ser-309            Inhibits                            237
Rio1                              A. fulgidus             Ser-108            Autophosphorylation                 169
Rio2                              A. fulgidus             Ser-128            Autophosphorylation                 168

(a) Included are proteins for which the nature of the protein-phosphate bond has been determined, as well as polypeptides that can be radiolabeled with [32P]phosphate or which are immunoreactive with antibodies against phosphotyrosine.
(b) Components of the two-component system are not included (see “Two-component system,” below).
(c) n.d., not determined.


CDC6 and methyltransferase-activating protein 


Several other archaeal proteins known to bind 
ATP have been reported to undergo autophosphory- 
lation, including Cdc6 from both M. thermauto- 
trophicus (102) and S. acidocaldarius (64), and the 
methyltransferase-activating protein from M. barkeri 
(59). While the nature and function of the phosphorylation
of Cdc6 remain controversial, autophosphorylation
of the methyltransferase-activating pro-
tein from M. barkeri renders it capable of stimulating 
the activity of the MT-1 methyltransferase (59). Acti- 
vation requires stoichiometric levels of the phospho- 
protein, even in the presence of excess ATP. Hence, 
the autophosphorylated protein would appear to ac- 
tivate its cognate methyltransferase by binding to it. 


Phosphohexosemutases 


In the case of the phosphohexosemutase from 
S. solfataricus, mapping the site of modification on 
the X-ray structure of a bacterial homolog suggested 
in what manner, and by what mechanism, phospho- 
rylation affects the enzyme’s catalytic activity (237). 
These models placed the site of phosphorylation, Ser- 
309, squarely within the substrate-binding site. Here, 
a phosphoryl group would be expected to inhibit 
catalysis by erecting an electrosteric barrier to sub- 
strate binding (Fig. 3). Introduction of a negative 
charge into the active site by mutagenically altering 
the phosphoacceptor serine to aspartate verified this 
prediction. A phosphohexosemutase from another 
thermophile, T. kodakaraensis, is phosphorylated on 
tyrosine (128). However, neither the location of the 
phosphoacceptor tyrosine residue(s) nor the effect of 
phosphorylation on catalytic activity was reported. 


Phenylalanyl-tRNA synthetase and RTCB 


The tyrosine-phosphorylated phosphomanno- 
mutase from T. kodakaraensis was isolated by affinity 
chromatography using an inactive form of the pro- 
tein-tyrosine phosphatase Tk-PTP (128). Two other 
proteins also adhered to this column and, following 
elution, displayed immunoreactivity toward antibod- 
ies against phosphotyrosine. The first was the β-chain
of a deduced phenylalanyl-tRNA synthetase. The sec- 
ond was a polypeptide, RTCB, of unspecified func- 
tion that is encoded within the operon for RNA ter- 
minal phosphate cyclase (128). 


Initiation factor 2 


It is highly likely that the phosphorylation of the 
α-subunit of initiation factor 2 from P. horikoshii in-
hibits this factor in a manner similar to that observed 
in the Eucarya (290). Not only is the site of phos- 
phorylation on the archaeal protein similar to that of 
its eucaryal homolog, but it also can be phosphorylated
by an RNA-dependent protein kinase (PKR)
from Homo sapiens. PKR is one of several protein kinases
that are known to phosphorylate eucaryal initiation
factor 2 in vivo (54). It is therefore particularly
noteworthy that P. horikoshii contains a
homolog of PKR, PH0512, that phosphorylates archaeal
initiation factor 2 in vitro (290).


Figure 3. Phosphorylation of phosphohexosemutase from S. solfataricus. Phosphorylation of Ser-309 on the phosphohexosemutase inhibits catalysis
by electrosterically interfering with the binding of substrate phosphohexoses.


D-Gluconate dehydratase


Phosphorylation of D-gluconate dehydratase 
from S. solfataricus appears to activate the enzyme, as 
incubation of the isolated phosphoprotein with alka- 
line or acid phosphatase leads to a concomitant loss 
of phosphate and activity (144). Neither the impact 
nor the specific site of modification by phosphoryla- 
tion appears to have been determined for any of the 
remaining proteins listed in Table 4. 


Protein-Serine/Threonine/Tyrosine Kinases 


At least five distinct molecular paradigms have 
evolved for catalyzing the phosphorylation of pro- 
teins on the side-chain hydroxyl groups of serine, 
threonine, and tyrosine: the eucaryal protein kinases, 
the myosin heavy-chain kinase/elongation factor-2 
kinases, the isocitrate dehydrogenase kinase/phos- 
phatase, the histidine-rich protein (HPr) kinases, and 
the Rsb/Spo kinases (137, 138). Among this quintet, 
the eucaryal protein kinases (ePKs) have emerged as 
the numerically dominant and phylogenetically cos- 
mopolitan species (179, 196, 197, 233). Deduced mem- 
bers of the extended ePK superfamily are found 
throughout the Eucarya (196) and most of the Bacte- 
ria (95, 137). It is therefore perhaps not surprising 
that a majority of archaeal genome sequences encode 
one or more (typically 3 to 5) potential ePKs (95, 
139), or that ePKs constitute the only apparent source 
of protein-serine/threonine/tyrosine kinase activity in 
the Archaea (138). 

Whereas the vast majority of eucaryal and bacter- 
ial ePKs resemble the catalytic subunit of the prototypic 
cAMP-dependent protein kinase (197), most archaeal 
ePKs belong to the RIO and piD261 subfamilies (179). 
These subfamilies deviate from the prototypic ePKs 
in two respects. First, they lack a basic amino acid 
residue in subdomain VIb, the catalytic loop, which 
distinguishes the serine/threonine- and tyrosine-spe- 
cific forms of classic ePKs. Second, their C-terminal 
protein-substrate-binding domain is shorter and lacks 
several of the structural features conserved in classic 
ePKs (167-169). 

Phylogenetic analyses indicate that the RIO/ 
piD261 subfamilies formed the progenitors of the 
classic ePKs (10, 81, 179). It is thus somewhat ironic 
that the former have been dubbed “atypical” ePKs 
(197). It has been hypothesized that the classic ePK 
paradigm originated in the Eucarya, wherein it un- 
derwent explosive expansion and from which it was 
acquired by the Bacteria by lateral gene transfer 
(179). Note that the Archaea are not completely de- 
void of proteins resembling classic ePKs, as was ini- 
tially believed. S. solfataricus, for example, contains 
at least two open reading frames (ORFs) encoding 
protein kinases that resemble classic ePKs (138). 


Rio1 and Rio2


The ORFs encoding five archaeal ePKs have been 
cloned and their protein products characterized: Rio1 
(169) and Rio2 (167, 168) from A. fulgidus, PH0512 
from P. horikoshii (290), and SsoPK2 (190) and 
SsoPK3 (191) from S. solfataricus. Both Rio1 and
Rio2 catalyze their own phosphorylation, as well as 
that of several exogenous proteins such as histone H1 
and myelin basic protein. In addition, Rio1 transpho- 
sphorylates itself (169). X-ray crystallographic analy- 
sis of Rio2 revealed the presence of an N-terminal 
winged helix-turn-helix motif of the type typically as- 
sociated with DNA-binding proteins (167). No such 
domain is present in Rio1. It is tempting to speculate 
that the winged helix-turn-helix domain of Rio2 tar- 
gets this enzyme to the components of a protein- 
oligonucleotide complex in vivo (170, 171). 


SsoPK2 and SsoPK3 


Similar to Rio1 and Rio2, SsoPK2 from S. solfa- 
taricus catalyzes its autophosphorylation as well as 
the phosphorylation of exogenous proteins such as 
casein, myelin basic protein, mixed histones, and a 
chemically modified form of lysozyme in vitro (190). 
In every instance, the enzyme targeted serine residues 
for modification. The rate at which these proteins 
were phosphorylated was very low, which raises ques- 
tions as to whether proteins serve as the physiologic 
target of this phosphotransferase. SsoPK2 contains a 
large, 300-residue, N-terminal domain of unknown 
function. SsoPK3, which was isolated as a phospho- 
protein present in membrane extracts of S. solfataricus, 
also possesses a large (180 residue) N-terminal do- 
main of undetermined function, as well as a poten- 
tial leucine zipper near the C terminus (191). While 
SsoPK3 displayed protein-serine/threonine kinase ac- 
tivity toward exogenous proteins such as casein, 
bovine serum albumin, myelin basic protein, and chem- 
ically modified lysozyme, it exhibited no propensity 
to autophosphorylate in vitro. 


PH0512


PH0512 from P. horikoshii was identified as
a potential homolog of the double-stranded RNA-activated
protein kinase that phosphorylates initiation factor
2-α in the Eucarya (290). The recombinant protein
product phosphorylated archaeal initiation factor 2-α
in vitro. The site of phosphorylation, Ser-48,
corresponds to the site of phosphorylation of its
eucaryal homolog, Ser-51. PH0512 appears to be the
first archaeal ePK for which a cognate substrate has
been identified.


Regulation of archaeal protein kinases 


Little is known about the mechanisms by which 
archaeal protein kinases are regulated. However, most 
archaea contain at least one deduced ePK that con- 
tains a predicted transmembrane domain, implicating 
them as components of transmembrane receptors (95). 
In H. volcanii, expression of mRNA encoding a de- 
duced ePK is responsive to changes in environmental 
salinity (28). SsoPK3 from S. solfataricus is phospho- 
rylated by a second membrane-associated protein 
kinase (191), most likely a glycosylated, threonine- 
preferring protein kinase (188, 189). It has not been 
determined whether phosphorylation of SsoPK3 
affects its catalytic properties. 


Protein-Serine/Threonine/Tyrosine Phosphatases 


ORFs representing four distinct families of protein- 
serine/threonine/tyrosine phosphatases have been dis- 
cerned in archaeal genomes. They are the PPP-family 
protein phosphatases, the PPM-family protein phos- 
phatases, the conventional protein-tyrosine phos- 
phatases (cPTP), and the low-molecular-weight protein- 
tyrosine phosphatases (LMW-PTP) (136, 138). 


PPP family 


The eucaryal representatives of the PPP family, 
which include protein phosphatase-1, protein phos- 
phatase-2A, and calcineurin, are serine/threonine- 
specific metalloenzymes (23). By contrast, the bacter- 
ial members of the PPP family are highly promiscuous, 
hydrolyzing protein-bound phosphoserine, phospho- 
threonine, phosphotyrosine, and phosphohistidine 
(136, 138). Most, but not all, bacterial PPPs require
the presence of exogenous divalent metal ions for
activity (136). Three members of the PPP family have
been characterized from archaeal organisms:
PP1-arch1 from S. solfataricus (178), PP1-arch2 from
Methanosarcina thermophila TM-1 (274), and Py-PP1
from M. abyssi (195). Each shares 25 to 30% sequence
identity with eucaryal PPPs and 15 to 19% identity
with bacterial PPPs. Similar to their eucaryal
counterparts, they are serine/threonine specific.
However, all three require the addition of a divalent
metal ion, usually Mn²⁺, for activity.


PPM family 


The PPM family of protein phosphatases is 
found in about half of the Bacteria and all members 
of the Eucarya (23, 136). All forms of the enzyme 
characterized to date require the presence of an ex- 
ogenous divalent metal ion for activity. The vast ma- 
jority are serine/threonine specific. However, forms 
that also hydrolyze protein-bound phosphotyrosine 
in vitro have recently been discovered in the cyano- 
bacteria (181, 252). Among the nearly two-dozen ar- 
chaeal genome sequences released to date, only one 
ORF encoding a member of the PPM family of diva- 
lent metal ion-dependent protein-serine/threonine 
phosphatases has been detected: TVN0703 from 
T. volcanium (138). The gene encoding this protein 
phosphatase was presumably acquired via lateral 
transfer from a bacterium. The protein product of 
this gene has not been characterized. 


Protein-tyrosine phosphatases 


Both the cPTPs and LMW-PTPs are found 
throughout the Eucarya and in many of the Bacteria 
(137, 268). All known LMW-PTPs are tyrosine spe- 
cific. On the other hand, the cPTPs have diverged into 
several subfamilies. Some, most notably the VH1 
family, are dual specific, i.e., they hydrolyze
protein-bound phosphoserine/phosphothreonine and
phosphotyrosine residues (5). Only one archaeal PTP,
Tk-PTP from T. kodakaraensis, has been characterized
(128). Tk-PTP is a member of the VH1-like branch of 
the cPTPs (5) and exhibits dual-specific phosphatase 
activity in vitro (128). 


Distribution “pattern” 


The heterogeneous distribution of these various 
protein phosphatases complicates the assignment of 
specific roles to particular enzymes or enzyme fami- 
lies. While at least one ORF encoding a deduced pro- 
tein-serine/threonine/tyrosine phosphatase is present 
in every archaeon that possesses an ePK, no one fam- 
ily is found in all archaea (138). The most widely dis- 
tributed families are the PPPs and cPTPs, each of 
which appears in roughly one-half of the Archaea. 
Phylogenetic analysis indicates that the PPPs (136) 
and, possibly, cPTPs (5) were probably present in the 
last universal common ancestor. If this is the case, 
then some archaea lost PPP and/or cPTP genes through
the course of evolution. 

Archaeal genomes display a striking imbalance 
between the number of deduced ePKs and counter- 
vailing protein phosphatases (138). In particular, in 
the archaea that contain four or more ePKs, the num- 
ber of recognizable protein phosphatases is in gen- 
eral one-half, or less, of the number of deduced pro- 
tein kinases. S. solfataricus, for example, may contain 
as many as eight ePKs but possesses only two pro- 
tein phosphatases: a PPP and a cPTP. The ratio in 
P. horikoshii is four potential ePKs to one cPTP. This 
apparent imbalance suggests that either (i) the Archaea
contain one or more families of yet-to-be-identified
protein phosphatases, (ii) archaeal protein
phosphatases serve as a largely static pool of protein
phosphatase activity, deferring modulation of the
phosphorylation state of individual proteins to the
protein kinases, or (iii) the substrate specificity of
archaeal protein phosphatases is modulated via control
of their spatial distribution rather than their gross
catalytic activity. This last mode of regulation is
employed by many eucaryal members of the PPP family (55).
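
The imbalance described above can be made concrete with a few lines of arithmetic. The following Python sketch is illustrative only: it uses the counts quoted in the text for S. solfataricus and P. horikoshii, and the names `genomes` and `ratios` are hypothetical conveniences rather than part of any published analysis.

```python
# (deduced ePKs, recognizable protein phosphatases), as quoted in the text.
# The ePK count for S. solfataricus is an upper estimate ("as many as eight").
genomes = {
    "S. solfataricus": (8, 2),  # one PPP and one cPTP
    "P. horikoshii": (4, 1),    # one cPTP
}

# Phosphatase-to-kinase ratio for each genome.
ratios = {
    name: phosphatases / kinases
    for name, (kinases, phosphatases) in genomes.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} phosphatases per ePK")
```

Both organisms come out at one phosphatase for every four deduced ePKs, well under the one-half threshold noted above for archaea with four or more ePKs.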


TWO-COMPONENT SYSTEM 


The term two-component, or His-Asp phospho- 
relay, system refers to a set of conserved phospho- 
transfer domains that are employed in modular fash- 
ion to construct signal transmission cascades linking 
a wide range of sensors to their intracellular targets 
(Fig. 4) (80, 229, 284). The ubiquitous core modules 
for these cascades consist of a pair of partner
phosphotransferases known as histidine kinases (103) and
response regulators (52, 283). Upon stimulation of an
associated receptor, a histidine kinase autophosphory- 
lates on a conserved histidine residue. The resulting 
phosphoprotein serves as the specific phosphodonor 
substrate for its partner response regulator, which
autophosphorylates on a conserved aspartate residue.
The autophosphorylated response regulator, in turn, 
triggers an appropriate cellular response either through 
interaction with a target protein, such as a flagellar 
motor (37), or through activation of a fused “output” 
domain, usually a DNA-binding domain or enzyme 
(283, 284). Many response regulators can employ 
acetyl phosphate as an alternative phosphodonor sub- 
strate in vitro. Consequently, it has been postulated 
that some organisms utilize acetyl phosphate as an
indicator metabolite, dubbed an “acetate switch,” that 
acts, at least in part, by fueling the autophosphoryla- 
tion of select response regulator proteins (319). 
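
The order of events in the core His-Asp phosphorelay can be summarized as a toy bookkeeping model. The Python sketch below is a deliberately minimal illustration under stated assumptions: the `Domain` class and the `autophosphorylate` and `phosphotransfer` functions are hypothetical names, and the model tracks only which domain carries the phosphoryl group, ignoring kinetics, ATP, and the output response.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """A two-component domain with a single phosphoacceptor residue."""
    name: str
    residue: str              # "His" or "Asp"
    phosphorylated: bool = False


def autophosphorylate(kinase: Domain) -> None:
    """Stimulated histidine kinase phosphorylates its conserved histidine."""
    kinase.phosphorylated = True


def phosphotransfer(donor: Domain, acceptor: Domain) -> bool:
    """Move the phosphoryl group from donor to acceptor, if possible."""
    if donor.phosphorylated and not acceptor.phosphorylated:
        donor.phosphorylated = False
        acceptor.phosphorylated = True
        return True
    return False


hk = Domain("histidine kinase", "His")
rr = Domain("response regulator", "Asp")

autophosphorylate(hk)                  # step 1: signal-dependent His phosphorylation
transferred = phosphotransfer(hk, rr)  # step 2: His-to-Asp phosphotransfer
```

After the two steps the phosphoryl group resides on the response regulator and the histidine kinase is free to respond to a further stimulus.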

Two-component systems differ in several fundamental
respects from protein-serine/threonine/tyrosine
phosphorylation cascades. First, autophosphorylation
is the predominant mechanism of phosphorylation in
the two-component system, whereas
protein-serine/threonine/tyrosine phosphorylation cascades rely
primarily on phosphotransfer reactions catalyzed by
protein kinases that are distinct from the
phosphoacceptor protein. This reliance on
autophosphorylation imposes greater constraints on the
proteins that ultimately can be targeted for regulation
by this mechanism than does phosphorylation by an
exogenous catalyst. Second, the chemical nature of the
phosphoryl moieties formed during two-component
signaling differs significantly from that of
protein-serine/threonine/tyrosine phosphorylation. The
phosphoramide and mixed acid anhydride bonds of
phosphohistidine and phosphoaspartate are significantly
higher in energy than the phosphoester bonds of
phosphoserine, phosphothreonine, and phosphotyrosine
(46, 52, 91, 315). Therefore, the thermodynamic barrier
to the “reverse” reaction, i.e., the transfer of the
phosphoryl group from the phosphoaspartate of a response
regulator to the histidine of a partner phosphodonor
protein, is relatively low.

The facility with which phosphoryl groups can be 
transferred in either direction between histidine and 
aspartate residues has been exploited to expand on the 
binary architecture of archetypical two-component 
signal transduction systems. These more extensive cas- 
cades often employ a third conserved module called a 
histidine phosphotransfer, or Hpt, domain. Hpt do- 
mains shuttle phosphoryl groups between response 
regulator domains, thus facilitating the construction 
of extended, oftentimes branched, two-component 
cascades (11, 114). Although it has been speculated 
that Hpt domains catalyze the phosphotransfer reac- 
tions in which they participate, they exhibit neither ki- 
nase nor phosphatase activity in isolation (229, 284). 
It therefore seems likely that phosphotransfer to and 
from the phosphoacceptor histidine of Hpt proteins 
is catalyzed by the autophosphorylation activity of 
their cognate response regulator domains. In this sce- 
nario, the phosphohistidine-containing Hpt domain 
functions as phosphodonor substrate in a manner 
analogous to an autophosphorylated histidine kinase. 
Phosphorylation of Hpt domains by autophosphory- 
lated response regulators would take place via the sim- 
ple reversal of this reaction. 


Distribution of Two-Component Domains 
among the Archaea 


The two-component system appears to have originated
in the Bacteria, from which it radiated into members
of the other phylogenetic domains (142, 156). An
examination of the current library of archaeal genome
sequences reveals that more than half contain open
reading frames encoding deduced histidine kinase and
response regulator domains (Table 5). However, the
distribution is strikingly skewed. While the majority
of the Euryarchaeota possess deduced two-component
systems, both the Nanoarchaeota and Crenarchaeota are
devoid of histidine kinase and response regulator
domains (18, 19).


Figure 4. Examples of typical architectures of two-component signal transduction cascades. Shown are schematic
representations of three hypothetical two-component signal transduction cascades. For each example, an external signal
(open diamond) activates a histidine kinase (hatched oval) by binding to its transmembrane receptor domain (open pentagon).
Response regulator domains are represented as diagonally striped rectangles. Output domains (filled circles and filled
hexagons); Hpt domains (open triangles); phosphoryl transfer events (hatched arrows); conserved histidine (H) and conserved
aspartate (D) residues within each two-component domain. A basic two-component signaling cascade (left); an extended
two-component cascade employing a hybrid histidine kinase, i.e., one that is fused to a response regulator domain, and
phosphoryl shuttle via an Hpt domain (middle); a branched two-component cascade whose right-hand branch includes a response
regulator domain-Hpt domain fusion protein that serves as a phosphoryl group shuttle bridging the histidine kinase to a
downstream target response regulator protein (right).

Table 5. Distribution of ORFs encoding potential two-component signal transduction domains within genomes of Archaea


Archaeon                                Phylum  T^a   HK^b (reference)                      RR (reference)                Hpt
Aeropyrum pernix                        C       95    None (2, 3, 5)                        None (3-5)                    None (3)
Archaeoglobus fulgidus                  E       83    14 (1, 3, 5), 15 (2), 23 (6), 21 (7)  11 (1, 3, 4, 7)               1 (3), none (6)
Ferroplasma acidarmanus                 E       40    None (7)                              None (7)                      n.d.
Halobacterium NRC-1                     E       37    10 (6), 12 (7), 14 (3, 8)             4 (4), 6 (8), 7 (7)           1 (3), none (6)
Halobacterium salinarium                E       37    13 (5)                                ≥2 (R & O 1996)               n.d.
Haloarcula marismortui                  E       37    59 (5), 61 (7)                        43 (7)                        n.d.
Methanothermobacter thermautotrophicus  E       65    16 (1, 5), 15 (2, 7)                  8 (3), 9 (4), 10 (1), 11 (7)  None (3)
Methanococcoides burtonii               E       23    35 (7)                                15 (7)                        n.d.
Methanococcus jannaschii                E       85    None (1, 2, 3, 5, 7)                  None (1, 3, 4, 7)             None (3)
Methanococcus maripaludis               E       37    3 (95, 7)                             3 (7)                         n.d.
Methanosarcina acetivorans              E       37    64 (6), 52 (7)                        19 (7), 20 (4)                1 (6)
Methanosarcina barkeri                  E       37    26 (7)                                13 (4, 7)                     n.d.
Methanosarcina mazei                    E       37    28 (6), 31 (7), 33 (5)                17 (4, 7)                     None (6)
Methanopyrus kandleri                   E       110   None (7)                              None (4, 7)                   n.d.
Methanospirillum hungatei               E       37    39 (7)                                78 (7)                        n.d.
Nanoarchaeum equitans                   N       90    None (5, 7)                           None (7)                      n.d.
Natronomonas pharaonis                  E       20    33 (7)                                20 (7)                        n.d.
Picrophilus torridus                    E       50    None (5)                              n.d.
Pyrobaculum aerophilum                  C       100   None (5)                              None (4)                      n.d.
Pyrococcus abyssi                       E       96    1 (3)                                 1 (3), 2 (4, 7)               None (3)
Pyrococcus furiosus                     E       96    None (94)                             None (4)                      n.d.
Pyrococcus horikoshii                   E       98    1 (1, 2, 3, 5, 7)                     1 (1, 3), 2 (4, 7)            1 (3)
Sulfolobus solfataricus                 C       80    None (5)                              None (4)                      n.d.
Sulfolobus tokodaii                     C       80    None (5)                              None (4)                      n.d.
Thermococcus kodakaraensis              E       102   1 (7)                                 2 (7)                         n.d.
Thermoplasma acidophilum                E       59    None (1, 3, 5, 7)                     None (1, 3, 4, 7)             None (3)
Thermoplasma volcanium                  E       60    None (5, 7)                           None (4, 7)                   n.d.

a Temperature optimum in degrees Celsius as reported in reference 19 with the exception of M. burtonii, which is 23°C (100).
b Abbreviations used: C, Crenarchaeota; E, Euryarchaeota; N, Nanoarchaeota; HK, histidine kinase domain; RR, response
regulator domain; Hpt, histidine phosphotransfer domain; n.d., not determined.


Speculation as to the cause(s) of this polarized
distribution pattern, such as transfer to an ancient
euryarchaeon (19), may be premature, as both the
Crenarchaeota and Nanoarchaeota are poorly represented
in the current collection of genome sequences.
In addition, it may be noteworthy that the four
crenarchaeal and sole nanoarchaeal exemplars are all
thermophiles, as the possession of two-component
systems within the Euryarchaeota displays a rough
correlation with growth temperature (Table 5).
Specifically, nine of the ten mesophiles and the sole
psychrophile possess multiple deduced two-component
systems. By contrast, only five of the eleven
thermophilic members of the Euryarchaeota possess
deduced two-component domains. Thus, two-component
systems may yet be found among undiscovered mesophilic
members of the Crenarchaeota and Nanoarchaeota.
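
The rough correlation with growth temperature can be expressed as simple proportions. The Python sketch below is a minimal illustration that uses only the counts stated in the preceding paragraph for the Euryarchaeota; the grouping labels and the names `counts` and `share` are assumptions made for the example.

```python
from fractions import Fraction

# (genomes with deduced two-component systems, genomes examined),
# taken from the counts quoted in the text for the Euryarchaeota.
counts = {
    "mesophile": (9, 10),
    "psychrophile": (1, 1),
    "thermophile": (5, 11),
}

# Exact proportion of each growth-temperature class that possesses
# deduced two-component systems.
share = {group: Fraction(have, total) for group, (have, total) in counts.items()}

for group, frac in share.items():
    print(f"{group}: {frac.numerator}/{frac.denominator} = {float(frac):.0%}")
```

With so few genomes in each class, these proportions are descriptive only and do not by themselves establish a statistically meaningful trend.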

Only a handful of deduced Hpt domains have 
been identified in the Archaea (Table 5). In every 
instance, they are found in Euryarchaeota that pos- 
sess deduced histidine kinase and response regulator 
domains. 


Physiological Roles of Archaeal 
Two-Component Systems 


One prominent function of archaeal two-compo- 
nent regulatory systems is to transmit the signals that 
guide chemo-, photo-, and/or aerotactic behavior (see 
Chapter 18). Homologs of the proteins that comprise 
the core of the bacterial signal transduction cascades 
responsible for guiding chemotaxis are encoded in the 
genomes of thirteen members of the Euryarchaeota 
(19, 95). These core units include MCPs, the CheA
histidine kinase, and two response regulators: CheY,
which modulates flagellar motor function, and CheB, 
whose methylesterase domain modulates the sensitiv- 
ity of the MCPs (20, 289). The predicted functional 
properties of each of these archaeal homologs have 
been verified using the genetically malleable halo- 
archaeon H. salinarium (see Chapter 21) as a model 
(249-251, 287). 

Probable output domains have been identified in 
a handful of other deduced archaeal response regula- 
tors. These include DNA-binding domains suggestive 
of transcriptional regulators in M. burtonii (100) and 
H. marismortui (19), and glycosyltransferase do- 
mains in M. thermautotrophicus (19). However, the 
vast majority of archaeal response regulator domains 
fall into the “one-domain” category; i.e., they lack a 
large, fused output domain (18, 19). Such one-domain 
regulators are presumed to act on cellular targets via 
protein-protein interactions, as is the case with the 
prototypic one-domain regulator CheY, or to serve
as intermediates that shuttle phosphoryl groups in an 
extended two-component cascade (19). 


Dephosphorylation of Response Regulator Domains 


While no archaeal protein has been directly 
demonstrated to exhibit protein-histidine or protein- 
aspartate phosphatase activity, archaeal genomes en- 
code several polypeptides analogous to those impli- 
cated as two-component phosphatases in the Bacteria. 
These include the phosphohistidine phosphatase SixA 
(207) and the chemotaxis-specific phosphoaspartate 
phosphatase CheC (288). Since SixA has been found 
only in a few crenarchaea, which themselves are devoid 
of two-component systems, the archaeal version of this 
enzyme presumably plays a role in metabolism rather 
than in sensor-response processes. On the other hand, 
open reading frames encoding deduced CheC phos- 
phatases have been identified in many of the same eur- 
yarchaea in which two-component systems reside (e.g., 
A. fulgidus, Halobacterium NRC-1, H. salinarium, 
M. acetivorans, M. mazei, M. maripaludis, N. pharaonis, 
P. abyssi, P. horikoshii, and T. kodakaraensis [146]), 
but not in other euryarchaea or the Nanoarchaeota 
and Crenarchaeota. This correlation strongly impli- 
cates CheC as a prime catalyst for the dephosphoryla- 
tion of those response regulator domains that partici- 
pate in aero-, chemo-, and phototactic sensor-response 
processes. No archaeal homologs of the regulator as- 
partyl-phosphate, RAP, phosphatases (238) have been 
identified. 

For those two-component systems not involved 
in mediating archaeal taxis, the most likely sources of 
protein phosphatase activity are the histidine kinase 
and response regulator domains themselves. Many 
bacterial response regulators possess intrinsic au- 
tophosphatase activity that often is enhanced through 
their association with other proteins (284). Notably, 
one of the best known of these enhancers, the CheZ
“phosphatase” (225) of the β- and γ-proteobacteria
(326), is absent from the Archaea despite the perva- 
siveness of other Che proteins (156). Several histidine 
kinases also have been revealed to be bifunctional; i.e., 
they can hydrolyze the phosphoaspartyl moieties of 
their cognate response regulators (284). Deletion of
the autophosphorylated histidine residue of the bac- 
terial EnvZ histidine kinase abrogates phosphotrans- 
fer but not protein-aspartate phosphatase activity, in- 
dicating that dephosphorylation takes place via a 
direct hydrolytic mechanism rather than via a “reverse 
kinase” mechanism (119, 270). 


OTHER COVALENT MODIFICATIONS


Archaeal proteins are the targets for a wide range
of posttranslational modifications (74). Several of
these appear to be structural in nature (e.g.,
glycosylation, fatty acylation, isoprenylation, and
prolyl cis-trans isomerization) and are not discussed in this
chapter. In addition, certain archaeal proteins contain 
the atypical amino acids selenocysteine (132) or 
pyrrolysine (108, 273) (see Chapter 9). Selenocysteine 
is commonly found in archaeal redox enzymes (317), 
while pyrrolysine is restricted to methylamine methyl- 
transferases, where it is speculated to serve as a cat- 
alytic electrophile (273). While the scarcity of these 
amino acids imbues them with a functional resem- 
blance to the modified amino acids produced via 
posttranslational modification, both selenocysteine 
(317) and pyrrolysine (157) are incorporated into the 
nascent polypeptide chain during translation from 
charged tRNA precursors. They are thus frequently 
referred to as the 21st and 22nd genetically encoded 
amino acids, respectively. 

The posttranslational modifications discussed 
below were highlighted because they have been 
demonstrated, at least in some instances, to modu- 
late the function of one or more target proteins from 
the Archaea or other organisms. 


Acetylation 


Two forms of protein acetylation have been re- 
ported in the Archaea. The first is acetylation of the 
amino terminus of ribosomal protein HmaS7 from 
the haloarchaeon H. marismortui, which presumably 
occurs as a structural processing event (150). The sec- 
ond is the acetylation of the side-chain amino group 
of Alba (26), a broad-specificity DNA-binding pro- 
tein found in thermophilic archaea and some eucarya 
(316). Alba and its relatives are thought to perform 
roles analogous to histones in archaeal chromatin (see 
Chapters 3 and 4). Acetylation of Alba decreases its 
affinity for DNA ~40-fold, blunting its ability to re- 
press transcription in a reconstituted system (26). 

Two distinct mechanisms have been proposed to 
account for the effects of lysine acetylation upon 
Alba’s affinity for DNA. Using the three-dimensional 
structure of Alba from S. solfataricus, models of 
Alba-DNA complexes were constructed and showed 
that the modified lysine residue, Lys-16, resided in an 
area of direct protein-DNA contact (312). Acetyla- 
tion would be expected to directly interfere with pro- 
tein-DNA binding by introducing steric hindrance 
and eliminating the potential for protonation, which 
would introduce a positive charge. Zhao et al. (325), 
on the other hand, observed that mutation of the 
acetyl-acceptor lysine in Alba from A. fulgidus dis- 
rupts homooligomerization, a prerequisite for high- 
affinity binding to DNA. 

The modulation of DNA-protein interaction by 
the acetylation of lysine residues is reminiscent of the 
situation in the Eucarya, where acetylation is one of a
cadre of reversible covalent modifications that mod- 
ulate chromatin structure by targeting broad-speci- 
ficity DNA-binding proteins, such as histones (58, 
104). This parallel was further illustrated by the ob- 
servation that the enzyme responsible for catalyzing 
the deacetylation of acetyl-Alba is a homolog of the 
eucaryal histone deacetylase, Sir2 (26). However, sev- 
eral archaeal genomes that encode Alba lack a recog- 
nizable Sir2 deacetylase, suggesting that the “histone 
code” is not employed by every archaeon (88, 316). 
Moreover, the recently identified Alba acetyltrans- 
ferase, Pat, is a homolog of a bacterial enzyme and 
not a eucaryal histone acetylase (200). Thus, despite 
employing some common components, the archaeal 
and eucaryal chromatin acetylation-deacetylation sys- 
tems may have developed independently (200). 


Deamidation 


In certain MCPs, sites of regulatory methylation 
(see “Methylation,” below) are created posttransla- 
tionally by the enzymatic deamidation of glutamine 
side chains (141). In the bacterium Bacillus subtilis, 
the source of this glutamine deamidase activity has 
been traced to the product of the gene cheD (141). 
Homologs of CheD have been identified in the genome 
sequences of several archaea, including A. fulgidus, 
H. salinarium, P. abyssi, and P. horikoshii (146). 


Diphthamidation 


As in its eucaryal homologs, a conserved histidine
in archaeal elongation factor 2 (EF-2) is covalently
modified to form
2-[3-carboxyamido-3-(trimethylammonio)propyl]histidine,
commonly referred to as diphthamide (223). This covalent modification
appears to be universal among the Archaea, as all ver- 
sions of EF-2 tested to date are sensitive to the ADP- 
ribosylating activity of the diphtheria toxin, which 
specifically targets the conserved diphthamide residue 
(139). The physiologic role of the diphthamide 
residue in archaeal EF-2 is not known. An unmodified 
protein was functionally competent when assayed in 
vitro (66), and archaeal EF-2 does not appear to be 
ADP ribosylated inside the cell. 


Disulfide Formation 


Reversible changes in protein structure and function
induced by the formation and reduction of disulfide
bonds provide a logical means for exerting
redox-mediated control of cellular processes. Behavior
suggestive of such a control mechanism has been
reported for four archaeal proteins: coenzyme F390
hydrolase from M. thermautotrophicus (307), A1
ATPase from M. mazei (177), inositol monophosphatase/
fructose bisphosphatase from A. fulgidus (282), and
ferredoxin from P. furiosus (266). In each of the first
three proteins, reduction of the critical disulfide bond
markedly enhances its catalytic activity. However, no
functional role for the reducible disulfide bond in the
archaeal ferredoxin has been elucidated. The activity
of coenzyme F390 synthetase, which opposes
that of the coenzyme F390 hydrolase, is also subject to
redox regulation. However, in the case of the
synthetase, regulation is mediated through changes in
the redox state of one of its substrates, coenzyme
F420, and not via modification of the protein
catalyst. Specifically, the reduced form of coenzyme
F420, 1,5-dihydro-coenzyme F420, is a potent
competitive inhibitor of the synthetase (306). Thus,
under reducing conditions where coenzyme F390 hydrolase
is activated, the accumulation of the reduced coenzyme
F420 results in the concomitant inhibition of the
synthetase (Fig. 5).


Hypusination 


In both the Archaea and Eucarya, a conserved 
lysine residue in initiation factor-5A (IF-5A)
undergoes a unique covalent modification to form
N-ε-(4-amino-2-hydroxybutyl)lysine, commonly referred
to as hypusine (24). Modification takes place in two 
steps that are catalyzed by the enzymes deoxyhypu- 
sine synthase and deoxyhypusine hydroxylase (41). 
Hypusination is essential in the Archaea, as inhibition 
of deoxyhypusine synthase leads to growth arrest in 
H. halobium, Haloferax mediterranei, S. acidocaldar- 
ius, and S. solfataricus (125). It has been speculated, 
based on an inspection of the three-dimensional 
structure of archaeal IF-5A, that hypusine directly 
participates in the binding of mRNA (324). 


Methylation 


A wide range of archaeal proteins is subject to
covalent modification by methylation (Table 6).
Prominent among them are MCPs associated with
many of the receptors that mediate aero-, chemo-,
and phototaxis via two-component signaling cascades.
Methylation of the γ-carboxylate group of
glutamate in MCPs, forming the corresponding
methyl ester, is a dynamic process modulated through
the opposing catalytic actions of methyltransferases
and methylesterases (37, 289). The activity of these
enzymes is modulated by the MCP’s cognate histidine
kinase in such a manner that stimulation of the
signal transduction cascade generally promotes
methylation (43, 153, 183, 228, 260, 280). MCP
methylation modulates the sensitivity of the receptor
complex to stimuli, thereby enabling cells to recalibrate
their sensor-response machinery as they progress along
a gradient of attractant or repellent (2, 250, 287).
While some MCPs contain a single site of methylation,
others extend the dynamic range of this adaptive
mechanism through the utilization of multiple
methylation sites whose modification shifts sensitivity
in a progressive manner (205).


Figure 5. Redox regulation of coenzyme F390 metabolism in M. thermautotrophicus. Enzyme names are italicized. Events that
stimulate (plus sign) or inhibit (minus sign) enzymatic activity; cysteine sulfhydryl groups (SH); cystine disulfides (S-S).

Other archaeal methylation events target the 
ε-amino group of lysine (Table 6). Subjects of lysine
methylation in the archaeon S. solfataricus include 
the DNA-binding proteins Sso7c (220) and Sso7d 
(25). In addition, a lysine methyltransferase with
activity toward the putative chromatin protein MC1-α
has recently been identified in M. mazei (198).
Methylation of archaeal DNA-binding proteins may
perform a similar role in modulating chromatin structure
in the Archaea as does histone methylation in the 
Eucarya. However, only minor differences were re- 
ported for the thermodynamics of DNA binding by 
methylated and nonmethylated versions of Sso7d 
(192). Alternatively, as the Sso7d protein has a chro- 
modomain-fold that contains a methyllysine recogni- 
tion motif, methylation may facilitate homooligomer- 
ization (316). 

It has been speculated that N-methylation of
lysine may enhance the thermostability of proteins in
hyperthermophilic archaea (199). This certainly
appears to be the case for the β-glycosidase from
S. solfataricus (85). On the other hand, despite the
fact that the degree of methylation of lysine residues
in Sso7d increases in concert with growth temperature
(25), methylation does not noticeably improve the
thermal stability of this protein in vitro (151).
Similarly, methylation did not significantly alter
either the thermal stability or catalytic properties of
the P2 (92) or P3 (93) ribonucleases from S. solfataricus.


Table 6. Methylated proteins from the Archaea^a


Protein                             Archaeon                 Amino acid(s)            Reference
MCPs
  BasT                              H. salinarium            O-γ-methylglutamate      154
  Car                               H. salinarium            O-γ-methylglutamate      285
  HtrI and HtrII                    H. salinarium            O-γ-methylglutamate      228
  HtrVII and HtrXI                  H. salinarium            O-γ-methylglutamate      43
  HtrVIII                           H. salinarium            O-γ-methylglutamate      42
  MpcT/Htr14                        H. salinarium            O-γ-methylglutamate      152
Other proteins
  50S ribosomal proteins HL3, HL,   H. cutirubrum            ND^b                     6
    HL10, HL11, HL14
  Methyl-CoM reductase (α subunit)  M. thermautotrophicus    2-(S)-methylglutamine,   264
                                                             5-(S)-methylarginine,
                                                             1-N-methylhistidine,
                                                             S-methylcysteine
  Methyl-CoM reductase (α subunit)  M. barkeri               5-(S)-methylarginine,    76
                                                             1-N-methylhistidine,
                                                             S-methylcysteine
  P2 ribonuclease                   S. solfataricus          N-ε-methyllysine         92
  P3 ribonuclease                   S. solfataricus          N-ε-methyllysine         93
  Glutamate dehydrogenase           S. solfataricus          N-ε-methyllysine         199
  β-Glycosidase                     S. solfataricus          N-ε-methyllysine         85
  Ferredoxin                        Sulfolobus sp. strain 7  N-ε-methyllysine         310
  Sso7c                             S. solfataricus          N-ε-methyllysine         220
  Sso7d                             S. solfataricus          N-ε-methyllysine         25

a Included are proteins for which the nature of the methyl-protein bond has been directly defined as well as proteins
observed to incorporate radioactive methyl groups.
b ND, not determined.

The methyl-coenzyme M reductases from M. bar- 
keri and M. thermautotrophicus contain a multiplic- 
ity of methylated amino acids, including 1-N-methyl- 
histidine, S-methylcysteine, 5-(S)-methylarginine, and 
2-(S)-methylglutamine (101). These methylated amino 
acids, along with another unusual modified amino 
acid, thioglycine, cluster around the enzyme’s active 
site. Their proximity to the site of catalysis suggests 
that these modifications play an important structural 
and/or functional role(s) in the enzyme. 2-(S)-Methyl- 
glutamine is unusual as the addition of the methyl 
group involves the formation of a C—C bond, rather 
than the C—O, C—N, and C—S bonds encountered 
in other methylated amino acids. 

Another form of protein methylation encoun- 
tered in the Archaea is associated with the repair of 
damaged proteins. The aberrant amino acid B- or 
isoaspartate is formed when the side chains of aspar- 
tate and asparagine undergo nucleophilic attack by 
the amido nitrogen of the adjacent peptide bond. Hy- 
drolysis of the resulting cyclic succinimide can take 
place with release of either the side-chain carboxylate,
yielding L-aspartate, or the α-carboxylate, yielding
isoaspartate. Methylation of the α-carboxylate by a
repair methyltransferase and S-adenosylmethionine 
provides a thermodynamically accessible route for 
reforming the cyclic succinimide and, subsequently, 
L-aspartate. By contrast to the repair methyltrans- 
ferases from the Eucarya and Bacteria, the isoaspar- 
tate methyltransferase from the hyperthermophilic 
archaeon P. furiosus targets D-aspartate as well as 
isoaspartate in peptide substrates (105, 294). Forma- 
tion, and hence repair, of the latter also proceeds via 
formation of the same cyclic succinimide intermediate 
as isoaspartate. 


Poly(ADP-ribosyl)ation 


In Eucarya, poly(ADP-ribose) polymerases use 
the dinucleotide NAD+ as a precursor for the synthesis
of protein-bound polymers of ADP-ribose that can
reach 200 units in length. This posttranslational mod- 
ification predominantly targets nuclear proteins such 
as histones, as well as the poly(ADP-ribose) poly- 
merase itself (60, 67). Poly(ADP-ribose) polymerase 
activity has been detected in the hyperthermophilic 
archaeon S. solfataricus (82). The protein substrates 
of this activity include the poly(ADP-ribose) polymerase
itself (60) as well as a small basic DNA-binding
protein (82). In common with its eucaryal counterparts,
the poly(ADP-ribose) polymerase from S. solfataricus
interacts with DNA (83). Single-stranded DNA moderately
stimulates the activity of the archaeal enzyme, while
double-stranded DNA has little or no effect (84).

It is widely presumed that (poly)ADP-ribosylation
regulates the functional properties of proteins, as
is the case with other covalent modifications such as 
protein phosphorylation-dephosphorylation (60, 67). 
However, many fundamental questions concerning 
the regulation of proteins by poly(ADP-ribosyl)ation 
remain unanswered, such as whether this modifica- 
tion is reversible in vivo. It is thus of considerable in- 
terest that the genome of A. fulgidus encodes a pro- 
tein, AF1521, containing a MACRO domain (4, 135). 
Computational analysis indicates that MACRO do- 
mains form the catalytic core of ADP-ribose phos- 
phoesterases (13). Consistent with this prediction, 
AF1521 binds ADP-ribose with nanomolar affinity 
(4, 135) and displays detectable phosphohydrolase 
activity toward this nucleotide in vitro (135). 


Regulated Proteolysis 


Targeted, controlled proteolytic events can play 
important roles in determining when and where spe- 
cific proteins manifest their functional properties 
(73). Examination of archaeal genomes provides in- 
triguing, albeit circumstantial, evidence for regulated 
proteolysis in the Archaea. For example, ORFs en- 
coding deduced transmembrane proteases of the 
rhomboid family have been identified in several archaea 
(155). In addition to receptor-like proteases, some 
archaea encode putative metalloproteases containing 
CBS domains, which have been suggested to act as 
potential allosteric binding sites for small molecules 
(9). Another potential avenue for controlling protease 
activity is provided by archaeal serpins (122, 242), 
members of a family of highly target-specific inhibitors 
of serine- and cysteine-proteases. 

Several examples of archaeal zymogens have 
been reported. Not surprisingly, many of these in- 
volve proteolytic enzymes such as pyrolysin from 
P. furiosus (309), a chymotrypsinogen B-like enzyme 
from N. pharaonis (281), and the β-subunits of the
archaeal 20S proteasome (328). Activation of the lat- 
ter is deferred until assembly of the proteasome (see 
Chapter 10), at which time a trans-autoproteolytic 
event unmasks latent catalytic activity (110). 

Zymogen activation has been observed for some 
nonproteases as well. Two pyruvoyl-dependent decarboxylases
from M. jannaschii, S-adenosylmethionine decarboxylase (143)
and arginine decarboxylase (297), autocatalytically cleave
larger precursor proteins into separate alpha and beta
subunits. The cleavage event generates the pyruvoyl moiety
from the N-terminal serine of the alpha chain, which forms a
Schiff’s base with substrates during catalysis.


Ubiquitination 


In the Eucarya, modification by the covalent at- 
tachment of the small, 76-residue protein, ubiquitin, 
serves a variety of regulatory functions; the best 
known function is tagging proteins destined for 
degradation by the 26S proteasome (314). In 1993, a 
protein from the archaeon T. acidophilum was iso- 
lated whose N-terminal sequence resembled that of 
ubiquitin (318). More recently, ORFs encoding ubiq- 
uitin-like proteins have been detected in the genomes 
of A. fulgidus, Aquifex aeolicus, M. thermautotro- 
phicus, and T. volcanium (29). However, while pro- 
teins reactive with antibodies against ubiquitin have 
been detected in the haloarchaeon Natronococcus 
occultus (217), it is unclear how this modification is 
effected in the Archaea as no homologs of the eucaryal 
ubiquitin-conjugating system have been reported. It 
has been postulated that the nascent polypeptide-as- 
sociated complex from Methanothermobacter mar- 
burgensis, which contains an ubiquitin-associated do- 
main, may play some role in protein ubiquitination in 
the Archaea (276). 


PERSPECTIVE: THE NEXT FIVE YEARS


Are the Archaea “Introspective”?


The number of recognizable transmembrane re- 
ceptors in most archaea appears to be exceedingly 
small (95), in particular, if those receptors associated 
with chemotactic and phototactic sensor-response 
cascades are discounted. Moreover, the genomes of 
nonchemotactic Archaea encode a limited number 
and range of deduced signal transmission proteins of 
the types normally associated with transmembrane 
signal transduction: one or perhaps two adenylate
cyclases, and a handful of protein-serine/threonine/ 
tyrosine kinases and phosphatases (95, 138). In con- 
trast, the number and variety of deduced cytoplasmic 
sensors appear to be much greater (95). In addition to 
a fairly wide range of allosteric domains that moni- 
tor the levels of key metabolites, the Archaea contain 
a significant number of proteins possessing cytoplas- 
mic receptor-like domains such as PAS domains, glo- 
bins, and CBS domains. The balance between extra- 
and intracellular sensors thus appears, at first glance, 
to be heavily biased toward internal sensors. This 
implies that many archaea are, to borrow the term coined
by Galperin (95), “introspective,” i.e., they respond to
changes in their surroundings only if and when those
changes impact the status of some key intracellular
molecules.


Does “Introspective” Mean Simple? 


While internal sensors play an important role in 
cellular regulation, reliance on derivative, internal 
cues as the means of viewing and responding to 
changes in the surrounding environment would ap- 
pear to entail several significant drawbacks. These in- 
clude a limited ability to discriminate between differ- 
ent environmental variables and a significant time lag 
between the onset of an environmental change and its 
detection. The former imposes a corresponding limit 
on the range of available responses. Thus, the impact 
of a predominantly reactive strategy on an organism’s 
prospects for survival, at first glance, seems clear. 

So how and why have these “introspective” ar- 
chaea managed to survive for eons? One possible ex- 
planation is that the rudimentary nature of the sen- 
sor-response machinery found in many archaea is, in 
fact, appropriately scaled to survival in a specialized 
environmental niche. A basic and (relatively) sluggish 
suite of receptors might be sufficient to monitor and 
respond to environments characterized by changes 
that are limited in number and rate. It therefore may 
be noteworthy that archaea dwelling in less extreme 
and more cosmopolitan environments (e.g., the halo- 
archaea and certain methanogens) have acquired (or 
retained?) extensive suites of sensor-response machin- 
ery, presumably to cope with the demands of their 
more dynamic and competitive habitats. 

However, the bias toward internal sensing may 
reflect an adaptive response to the harsh nature of the 
surrounding environment. Extremes of heat, pH, and 
the combination thereof severely stress the chemical 
and conformational stability of proteins. Whenever 
practicable under these circumstances, sheltering sen- 
sors inside the protection of the cell membrane may 
afford significant advantages. The moderate and care- 
fully buffered pH of the cytoplasm minimizes the rate 
of deleterious hydrolytic reactions while providing 
unfettered access to the cell’s internal protein main- 
tenance and repair services. Species that readily dif- 
fuse through the membrane, such as oxygen or car- 
bon dioxide, are particularly well suited for monitoring 
by internal receptors, a fact reflected in the large 
number of internal PAS and globin domains. Other 
nutrients that are routinely transported into the cy- 
toplasm for assimilation can also be monitored inter- 
nally at little additional cost. 

However, it must be borne in mind that the ability to
gauge the degree to which the Archaea are “introspective”
is predicated on the ability to identify internal and
external sensors from genome sequence
information. A consistent leitmotif of genomic analy- 
ses has been the recognition that the structure and 
function of a large proportion of the proteins encoded 
in living organisms remain unidentified. Presumably 
these include numerous archaeal sensors. The uncon- 
ventional nature of many archaeal habitats and the 
unique features of their metabolism certainly would 
be expected to stimulate the development of novel, 
and hence difficult to recognize, receptors. Moreover, 
the Archaea possess ample raw material for this pur- 
pose, as one in five archaeal proteins features a 
deduced transmembrane domain (12, 311). Lipid- 
anchored secreted proteins, many of which exhibit 
potential ligand binding domains, provide an addi- 
tional reservoir of potential external sensors (31). 
While future discoveries of new receptors may show 
the Archaea to be more outward looking than cur- 
rently estimated, the perception that they are gener- 
ally introspective is likely to be enduring. 


Insights into the Evolution of Protein Regulation 


It appears likely that allosteric feedback regula- 
tion represents nature’s most ancient mechanism for 
modulating protein function. Its elegant simplicity and 
apparent economy (by utilizing preexisting metabo- 
lites and protein domains) constitute compelling argu- 
ments for a seminal role in the pantheon of macro- 
molecular regulation. Existing metabolites serve as 
indicators of status, while allosteric “receptor” do- 
mains are fabricated via the duplication or fusion of 
extant genes. The discovery of riboswitches suggests 
that allosteric modulation of macromolecular function 
may predate the emergence of proteins (308). 

When did other regulatory mechanisms appear? 
What did the sensor-response network of early ar- 
chaea look like? Galperin (95) observes that the only 
signal transmission proteins that approach universal 
distribution among the Archaea are adenylate cyclase 
and members of the extended ePK family. It would 
therefore appear that cAMP represents a universal 
and extremely ancient second messenger (95). Pro- 
tein-serine/threonine/tyrosine phosphorylation also 
appears to predate the divergence of the three do- 
mains of life (138). Phylogenetic analyses suggest that 
the ePKs (233), PPP-family protein phosphatases (27, 
136), and, potentially, the cPTPs (5) and LMW PTPs 
(27) trace their lineage back to the last universal com- 
mon ancestor. 

The fact that protein phosphorylation (and de- 
phosphorylation) requires the development of a sep- 
arate and specialized enzyme species, the protein ki- 
nase (and phosphatase), would appear at first glance 
to argue that regulation by diffusible allosteric ligands
must predate this covalent modification by a considerable
margin. By employing materials already on
hand, allosterism would appear to impose minimal 
overhead costs. However, two points should be con- 
sidered in determining when these resultant mecha- 
nisms originated and evolved. First, the RNA-world 
model (51) indicates that phosphoester and phospho- 
ramide chemistry predates the emergence of proteins 
as the primary source of catalytic activity in cells. 
Thus, it is likely that phosphotransferases and phos- 
phoesterases were among the first protein catalysts 
to emerge. Second, it is relatively easy to propagate 
this mechanism once the specialized catalyst is at 
hand (321). 

If one accepts the premise that the phosphory- 
lation and dephosphorylation of proteins is an an- 
cient process, two questions emerge. First, given that 
virtually every archaeon possesses one or more ePKs, 
why is there so little consistency in the distribution 
of counterpart protein phosphatases among the 
Archaea? Second, why do so few archaea possess pro- 
totypic ePKs, as the preponderance of the prototypic 
form of ePK in the Eucarya indicates that it is superior
to the RIO and piD261 ePKs that dominate in
the Archaea? 


Do Unique Sensor-Response Mechanisms Remain To Be Discovered in the Archaea?


As the sensor-response machinery of the Archaea is
explored in greater depth, novel variations on established
themes and entirely new mechanisms will undoubtedly be
discovered. Many “orphan” ligand-binding domains will be
united with their signal transmission partners, and vice
versa. Of
particular interest will be the sensor-response mecha- 
nisms utilized by the Archaea for cell-cell communi- 
cation. Do the Archaea utilize secreted messengers (as 
occurs in bacterial quorum sensing) to modulate 
growth and coordinate activities with their neighbors 
in biofilms and other microbial consortia? Do they 
engage in contact-mediated detection? The presence of
deduced extracellular proteins containing homologs of
domains found in some eucaryal cell surface proteins,
e.g., polycystic kidney disease, β-propeller, and β-helix
domains (130, 233), is suggestive of a role for
contact-mediated communication.

One potential explanation for the apparent im- 
balance between the number of protein-serine/threo- 
nine/tyrosine kinases and protein-serine/threonine/ 
tyrosine phosphatases that is typical of most archaea 
is the existence of one or more unrecognized families 
of protein-serine/threonine/tyrosine phosphatases in 
the Archaea. Likewise, the paucity of stereotypical
second messengers in the Archaea suggests the presence
of novel species as well. Recently, both a novel
family of protein-tyrosine phosphatases (213) and a 
new second messenger, cyclic di-GMP (61), were dis- 
covered in the Bacteria. Given that the members of 
the bacterial domain have been the subject of decades 
of intensive study, it appears highly likely that the Ar- 
chaea will not only be found to contain new sensor- 
response mechanisms and molecules, but that they 
will provide new insights into this vital process in 
other organisms as well.


Chapter 12


Central Metabolism 


MICHAEL J. DANSON, HENRY J. LAMBLE, AND DAVID W. HOUGH


INTRODUCTION 


Organisms utilizing organic nutrients do so to 
supply the precursors of all their cellular components 
and to generate energy for biosynthesis and other 
endergonic processes. The degradative pathways by 
which these nutrients are metabolized are known as 
“catabolic” routes, whereas the biosynthetic pathways 
are referred to as “anabolic” routes. The metabolic 
link between these two processes is provided by the 
pathways of central metabolism, the reactions of 
which also serve as the main energy-generating routes. 
Organisms growing autotrophically also have these 
same pathways of central metabolism, although the 
primary function in these cells is to provide biosyn- 
thetic precursors, while energy is produced photosyn- 
thetically or via chemolithotrophic reactions. 

The pathways of central metabolism are therefore 
at the heart of an organism’s total metabolic capacity, 
and their wide conservation suggests they were an early 
evolutionary invention. Consistent with this view are 
common themes that are found spanning the Archaea, 
Bacteria, and Eucarya, although variations are ob- 
served that reflect not only phylogeny but also partic- 
ular lifestyles and requirements. With this in mind, the 
principal aim of this chapter is to describe the central 
metabolic pathways of the Archaea and to identify the 
unique or unusual features of archaeal metabolism. 

Several reviews on archaeal central metabolism 
have been written previously. However, they have 
necessarily relied mainly on enzymological data, and 
what makes a new review important at this stage is 
that we are now able to integrate these data with in- 
formation from genome sequences. While taking this 
line, it will become clear that the currently fashionable
“Systems Biology” approach depends on data input from both
biochemistry and molecular biology; indeed, in various
areas of central metabolism, gene sequences, at best,
predict enzymic activities, whereas biochemical studies
actually define them.


THE PATHWAYS OF CENTRAL METABOLISM IN THE ARCHAEA


The chapter is structured in four main sections. 
The first two form the centerpiece of central metabo- 
lism: the conversion of sugars to pyruvate, and then the 
metabolic fate of pyruvate, either to organic end products
or to CO2 by complete oxidation via the citric acid
cycle. It is directly into these two sets of pathways that 
all nutrients essentially feed and from which biosyn- 
thesis commences. However, it is necessary and instruc- 
tive to add two further sections. First, growth on ac- 
etate is discussed as this may involve an additional 
cyclic pathway, the glyoxylate cycle. Second, the catab- 
olism of amino acids is included; while these do feed 
into the citric acid cycle, catabolism of branched-chain 
amino acids in particular deserves a special mention as 
it is in these reactions that the presence of a family of 
multienzyme complexes was discovered, which were 
until recently thought to be absent from all archaea. 

To fulfil the objective of integrating biochemical 
and genomic data, discussions and analyses are con- 
centrated on those archaea whose genomes have been 
sequenced. However, inevitably there will be crucial 
data from archaeal organisms for which the complete 
DNA sequence is not known, and reference will be 
made to these where necessary. 


The Metabolism of Monosaccharides to Pyruvate 


Catabolism of glucose 


Virtually all organisms from all three domains of 
life have the ability to metabolize glucose to pyruvate.
This applies regardless of whether an organism grows
anaerobically or aerobically, heterotrophically or
autotrophically, or whether it employs fermentative
metabolism. The conventional Embden-Meyerhof (EM) pathway
is perhaps the most widely recognized and thoroughly
investigated central metabolic pathway (48), and its
conservation has led to the perception that the pathway is
fundamental and sacrosanct. However, in reality there is
considerable variability in the
route of glucose catabolism in different organisms, 
and in 1952 a fundamentally distinct pathway for the 
breakdown of glucose to pyruvate was reported in 
Pseudomonas saccharophila (41). This Entner-Doudo- 
roff (ED) pathway was revealed by a characteristic dif- 
ference to the EM pathway in the labeling pattern of 
pyruvate when 1-¹⁴C-glucose is metabolized (Fig. 1).
It is now apparent that the ED pathway of glucose me- 
tabolism has a considerably wider distribution than 
was originally appreciated (23, 49, 86). Organisms 
from all three domains of life have been shown to possess 
ED-type pathways, often alongside the EM pathway. 
In the Archaea, EM and ED pathways, and variations 
and combinations thereof, represent the predominant 
routes for glucose catabolism. In certain bacteria and
lower eucarya, other pathways, most notably the oxidative
pentose phosphate pathway and the phosphoketolase
pathway, can also contribute to glycolytic flux (24).
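
The labeling difference between the EM and ED pathways described above can be illustrated with a small carbon-tracing sketch. It is only a bookkeeping aid, not part of the original chapter: the table name PYRUVATE_ORIGIN and the function labeled_positions are hypothetical, and the carbon mapping assumes the standard aldolase cleavage chemistry of the classical pathways (pyruvate C1 = carboxyl, C2 = carbonyl, C3 = methyl).

```python
# Illustrative carbon-tracing sketch (names are hypothetical, not from the text).
# For each pathway, the two pyruvate molecules formed from one glucose are
# listed as tuples giving the glucose carbon found at pyruvate C1, C2, and C3.
PYRUVATE_ORIGIN = {
    # EM: fructose 1,6-bisphosphate is split into dihydroxyacetone phosphate
    # (glucose C1-C3) and glyceraldehyde 3-phosphate (glucose C4-C6). After
    # triose-phosphate isomerase, glucose C1 and C6 both end up as methyl carbons.
    "EM": [(3, 2, 1), (4, 5, 6)],
    # ED: KDPG aldolase releases pyruvate directly from glucose C1-C3 and
    # glyceraldehyde 3-phosphate from glucose C4-C6.
    "ED": [(1, 2, 3), (4, 5, 6)],
}


def labeled_positions(pathway, glucose_label):
    """Return the pyruvate positions (1, 2, or 3) that carry the label."""
    positions = []
    for pyruvate in PYRUVATE_ORIGIN[pathway]:
        for position, glucose_carbon in enumerate(pyruvate, start=1):
            if glucose_carbon == glucose_label:
                positions.append(position)
    return positions


# Glucose labeled at C1: methyl carbon via EM, carboxyl carbon via ED.
em_label = labeled_positions("EM", 1)
ed_label = labeled_positions("ED", 1)
```

Under these assumptions, label from C1 of glucose appears in the methyl group of pyruvate via the EM pathway and in the carboxyl group via the ED pathway, which is the kind of distinction that Fig. 1 summarizes.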

The classical EM and ED pathways (Fig. 2) both 
begin with the phosphorylation of glucose to glucose 
6-phosphate, which may be performed by a specific 
glucokinase or a broad specificity hexokinase. Alter- 
natively, the phosphoenolpyruvate-dependent phos- 
photransferase system, which is employed by many 
bacteria for sugar uptake, phosphorylates glucose as 
it is transported into the cell. In the EM pathway, 
phosphoglucose isomerase then converts glucose 
6-phosphate to fructose 6-phosphate, which is further 
phosphorylated by phosphofructokinase to produce 
fructose 1,6-bisphosphate. Fructose-1,6-bisphosphate 
aldolase then catalyzes the cleavage of the C6 sugar, 
fructose 1,6-bisphosphate, into two C3 compounds, 
glyceraldehyde 3-phosphate and dihydroxyacetone 
phosphate. A second molecule of glyceraldehyde 3- 
phosphate is produced from dihydroxyacetone phos- 
phate via the action of triose-phosphate isomerase. 

Figure 1. Labeling of pyruvate during glucose catabolism. The characteristic labeling pattern of pyruvate resulting from glucose catabolism by the Embden-Meyerhof and Entner-Doudoroff pathways.

Figure 2. Pathways of glucose metabolism. The classical Embden-Meyerhof and Entner-Doudoroff pathways of bacteria and eucarya are shown in bold with each step shown connected by large full arrows, while the alternative pathways of selected archaeal genera are displayed with various small arrows (see key). Unless specified, the cofactor usage is as shown for the classical pathways. Enzymes are denoted by numbers: 1 = glucokinase, 2 = phosphoglucose isomerase, 3 = phosphofructokinase, 4 = fructose-1,6-bisphosphate aldolase, 5 = triose-phosphate isomerase, 6 = glyceraldehyde-3-phosphate dehydrogenase, 7 = phosphoglycerate kinase, 8 = phosphoglycerate mutase, 9 = enolase, 10 = pyruvate kinase, 11 = glyceraldehyde-3-phosphate ferredoxin oxidoreductase, 12 = nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase, 13 = glucose-6-phosphate dehydrogenase, 14 = 6-phosphogluconate dehydratase, 15 = KDPG aldolase, 16 = glucose dehydrogenase, 17 = gluconate dehydratase, 18 = KDG kinase, 19 = KDG aldolase, 20 = glyceraldehyde dehydrogenase, 21 = glycerate kinase. In Sulfolobus species it is not yet clear whether the conversion of glyceraldehyde to glycerate is catalyzed by glyceraldehyde dehydrogenase (20) or glyceraldehyde oxidoreductase; see text for details. The reactions involved in the conversion of glucose, or other C6 sugars, to C3 intermediates make up the upper pathway, whereas the lower pathway refers to the conversion of C3 intermediates to pyruvate.

In the classical ED pathway, the glucose 6-phosphate
is oxidized by glucose-6-phosphate dehydrogenase
to produce 6-phosphogluconate, which is then
dehydrated by 6-phosphogluconate dehydratase to
give 2-keto-3-deoxy-6-phosphogluconate (KDPG).
KDPG aldolase then catalyzes the cleavage of the C6
compound, KDPG, to produce two C3 compounds,
glyceraldehyde 3-phosphate and pyruvate. In both
EM and ED pathways, glyceraldehyde 3-phosphate
is converted to pyruvate by a five-step reaction
sequence. It is first phosphorylated and oxidized by
glyceraldehyde-3-phosphate dehydrogenase to give
1,3-bisphosphoglycerate. This is then converted to
3-phosphoglycerate by phosphoglycerate kinase,
coupled to the synthesis of ATP from ADP.
3-Phosphoglycerate is rearranged by phosphoglycerate mutase
to 2-phosphoglycerate, which is dehydrated by
enolase to phosphoenolpyruvate. Finally, pyruvate kinase
converts phosphoenolpyruvate to pyruvate, with the
synthesis of another molecule of ATP from ADP. For
each molecule of glucose, the classical EM pathway
has an overall ATP yield of two, while the classical
ED pathway has an overall yield of one.
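
The ATP yields quoted for the classical pathways follow from simple bookkeeping over the ATP-consuming and ATP-forming steps. The sketch below is an illustration only: the names CLASSICAL_PATHWAYS and net_atp are hypothetical, and it assumes that glucose is phosphorylated by an ATP-dependent glucokinase or hexokinase rather than by the phosphotransferase system.

```python
# Illustrative ATP bookkeeping for the classical pathways (hypothetical names).
# Each entry: (enzyme, ATP change per reaction, reactions per glucose).
# Assumption: glucose is phosphorylated at the expense of ATP.
CLASSICAL_PATHWAYS = {
    "EM": [
        ("glucokinase", -1, 1),
        ("phosphofructokinase", -1, 1),
        # Both C3 halves of glucose pass through the lower pathway.
        ("phosphoglycerate kinase", +1, 2),
        ("pyruvate kinase", +1, 2),
    ],
    "ED": [
        ("glucokinase", -1, 1),
        # Only the glyceraldehyde 3-phosphate half passes through the lower
        # pathway; the other half is released directly as pyruvate.
        ("phosphoglycerate kinase", +1, 1),
        ("pyruvate kinase", +1, 1),
    ],
}


def net_atp(steps):
    """Sum the ATP change over all steps for one molecule of glucose."""
    return sum(change * count for _, change, count in steps)


em_yield = net_atp(CLASSICAL_PATHWAYS["EM"])
ed_yield = net_atp(CLASSICAL_PATHWAYS["ED"])
```

With these entries, net_atp gives two for the EM pathway and one for the ED pathway, in agreement with the yields stated in the text.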

The nature of glucose catabolism in the Archaea 
is of considerable interest because of its fundamental 
importance to understanding their unique biochem- 
istry, and because of the evolutionary significance of 
the highly conserved pathways and enzymes. As such 
it has been the subject of several reviews both before 
(27, 30) and after (126, 166) the publication of ge- 
nome sequences. The ensuing text will attempt to 
provide an up-to-date report of the novel pathways, 
enzymes, and cofactors found in archaeal glucose me- 
tabolism, and compare and contrast them to the clas- 
sical pathways of eucarya and bacteria. In addition, 
differences in the metabolism of other monosaccha- 
rides, the regulation of the pathways, and gluconeo- 
genesis will be considered. Inevitably the literature 
represents a considerable bias toward research into 
the metabolism of particular sugars (especially glu- 
cose) in certain “model” organisms, in particular, 
those employing fermentative metabolism. Fortu- 
itously, the species that have been the focus of most of 
the research into archaeal metabolism represent a di- 
verse range of archaeal genera and growth environ- 
ments. The combination of genome sequence and bio- 
chemical data from these organisms has thus allowed 
a complete picture of glucose metabolism in the Ar- 
chaea to emerge. 


Halobacterium, Haloarcula, Halococcus, and 
Haloferax. Some important early work on archaeal 
glucose metabolism was performed with the extreme 
halophile, Halobacterium saccharovorum, isolated 
from a saltern in San Francisco Bay, California (159). 
Enzymological studies on cell extracts revealed that 
glucose metabolism occurs by a part-phosphorylative 
variant of the ED pathway (160). In this pathway, instead
of being phosphorylated, glucose is first oxidized to
gluconate by glucose dehydrogenase. A dehydratase enzyme
then converts gluconate to 2-keto-3-deoxygluconate (KDG),
which is phosphorylated to KDPG by KDG kinase. 
KDPG is then metabolized as in the classical ED path- 
way to yield two molecules of pyruvate (Fig. 2). As 
with the classical ED pathway, this variant has an 
overall yield of one ATP per molecule of glucose. This 
part-phosphorylative pathway had previously been 
observed in Rhodopseudomonas sphaeroides (157) 
and several other bacterial species, including Clostrid- 
ium aceticum, where it is employed for the metabo- 
lism of gluconate (5). 

Subsequent enzymological studies have shown 
that the part-phosphorylative ED pathway also ac- 
counts for glucose metabolism in other halophilic ar- 
chaeal genera, including Haloferax, Haloarcula, and 
Halococcus (119, 144, 145). In most cases, the ED 
enzymes have been shown to be inducibly expressed 
during growth on glucose, and not produced constitutively
(72, 119, 154). A ¹³C-glucose labeling study
in Halococcus saccharolyticus has confirmed that glu- 
cose metabolism occurs exclusively via the part-phos- 
phorylative ED pathway in this organism (72). The 
first enzyme of the pathway, glucose dehydrogenase, 
has been purified and characterized from Haloferax 
mediterranei and shown to be a Zn2+-containing enzyme
with a dual specificity for NAD+ and NADP+ (11). A
complete glycolytic EM pathway has not been
detected in any halophilic archaeon, and it therefore 
appears that the different halophilic genera perform 
glucose catabolism exclusively via the part-phospho- 
rylative ED pathway. 

It has become clear that most halophiles can 
grow on carbohydrate energy sources, despite initial 
predictions to the contrary (119). Even species that 
are reported to be unable to use carbohydrates, such 
as Halobacterium sp. NRC-1, have been shown to 
possess the genes of the part-phosphorylative ED 
pathway. In the published genome sequence of this 
organism (105), genes have been annotated for all en- 
zymes of the part-phosphorylative ED pathway and, 
interestingly, genes for the upper pathway enzymes, 
glucose dehydrogenase, gluconate dehydratase, and 
KDPG aldolase, appear in a cluster, without the KDG 
kinase gene. There is a near-full complement of genes 
for the EM sequence, although no gene for 6-phos- 
phofructokinase has been annotated (166). A similar 
complement of genes was found subsequently to be 
present in the genome of Haloarcula marismortui, al- 
though in this organism there is no cluster of ED 
genes (8). Genes for glucose dehydrogenase, gluconate 
dehydratase, and KDG kinase from the two organisms have a
high level of amino acid sequence identity, although there
is no clear KDPG aldolase ortholog in H. marismortui.
Additionally, the predicted KDPG aldolase gene from
Haloferax alicantei (65) does not have a clear ortholog in
either genome, implying a diverse evolutionary history to
this gene within the halophilic archaea. As yet, it is
unclear whether halophilic organisms with a highly
restricted metabolic capability, such as Halosimplex
carlsbadense (170), also retain the genes and enzymes for
the ED pathway.

The route of glucose metabolism in halophilic 
archaea appears to contrast with the situation in 
halophilic bacteria. A recent enzymological study on 
the extremely halophilic bacterium Salinibacter ru- 
ber concluded that glucose is metabolized via the clas- 
sical ED pathway, and not the part-phosphorylative 
variant, despite the physiological closeness between 
this organism and halophilic archaea of the family 
Halobacteriaceae (108). This observation requires 
further investigation and may have interesting evolu- 
tionary implications pertaining to the lateral transfer 
of metabolic genes between the Bacteria and Archaea. 


Sulfolobus. Sulfolobus species exhibit consider- 
able metabolic diversity and versatility and are com- 
monly considered to be opportunistic heterotrophs, 
capable of utilizing a wide range of carbohydrate en- 
ergy sources (57, 131). Early work on Sulfolobus sol- 
fataricus showed that glucose was metabolized via a 
nonphosphorylative variant of the ED pathway (36). 
Glucose-labeling studies have suggested that a similar 
variant pathway accounts for metabolism in the 
closely related organism S. acidocaldarius (142) and
other Sulfolobus species (175). In the nonphospho- 
rylative ED pathway (Fig. 2), glucose is metabolized 
to KDG as described for the part-phosphorylative 
variant, although, in this case, KDG is cleaved directly 
to glyceraldehyde and pyruvate by KDG aldolase. 
Glyceraldehyde dehydrogenase or glyceraldehyde ox- 
idoreductase is then thought to oxidize glyceralde- 
hyde to glycerate, which is phosphorylated by glyc- 
erate kinase to give 2-phosphoglycerate. A second 
molecule of pyruvate can then be produced from this 
by the actions of enolase and pyruvate kinase, as oc- 
curs in the classical pathways. A similar nonphos- 
phorylative pathway had previously been shown to 
be responsible for the metabolism of gluconate in As- 
pergillus niger (40). Note that the nonphosphoryla- 
tive ED pathway proceeds with no net yield of ATP. 
This is in contrast to the classical and part-phospho- 
rylative routes described previously, both of which 
have an ATP yield of one mole per mole of glucose. 

Glucose dehydrogenase from S. solfataricus has 
been purified and characterized and found to be a 
highly expressed and active enzyme, with specificity 
for NAD+ and NADP+, as was observed with the 
enzyme from H. mediterranei (98). Gluconate dehydratase 
and KDG aldolase, which both possess (β/α)8 
structures, have also been purified and characterized 
biochemically (16, 99). Enzymes of the lower path- 
way, from glyceraldehyde to 2-phosphoglycerate, 
have not been fully characterized in Sulfolobus spp., 
although a molybdenum-containing glyceraldehyde 
oxidoreductase has been discovered in S. acidocal- 
darius (78), and putative glyceraldehyde dehydroge- 
nase genes have been annotated in S. solfataricus 
(166). The enzyme activities constituting a complete 
EM pathway have not been detected in any Sulfolobus 
species, with a notable absence of phosphofructoki- 
nase activity. 

Unexpectedly, the genome sequence of S. solfa- 
taricus (146) revealed that the genes encoding glu- 
conate dehydratase and KDG aldolase were in an ED 
cluster, upstream of putative genes encoding KDG ki- 
nase and nonphosphorylating glyceraldehyde-3-phos- 
phate dehydrogenase. Subsequent biochemical analy- 
sis of the natural and recombinant enzymes confirmed 
these activities and also revealed that the KDG al- 
dolase displays KDPG aldolase activity (1). This work 
provides convincing evidence that, in fact, S. solfatar- 
icus metabolizes glucose via the part-phosphorylative 
ED pathway, which may occur alone or in parallel 
with the nonphosphorylative pathway (Fig. 2). 

The NADP-dependent, nonphosphorylating gly- 
ceraldehyde-3-phosphate dehydrogenase represents a 
bypass of the conventional glyceraldehyde-3-phos- 
phate dehydrogenase and phosphoglycerate kinase, 
and is thought to operate only in the catabolic direc- 
tion. An orthologous gene is also present in the ge- 
nome of Halobacterium sp. NRC-1 (166). Although 
the corresponding enzyme activity has not been con- 
firmed, it is possible that halophilic archaea also em- 
ploy this enzyme in the catabolic direction. The result 
of this enzyme bypass is that the modified part-phos- 
phorylative pathway that occurs in S. solfataricus has 
a net ATP yield of zero. 
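
As a rough check on the ATP yields quoted in this section, the following Python sketch tallies the ATP consumed and produced per mole of glucose for each ED variant. The step lists are a simplified reading of the text combined with standard textbook stoichiometry (for example, that glucokinase, KDG kinase, and glycerate kinase each consume one ATP, and that only one three-carbon unit per glucose passes through pyruvate kinase in an ED-type pathway). The names ED_VARIANTS and net_atp are illustrative and do not come from the cited studies.

```python
# Net ATP bookkeeping for the Entner-Doudoroff (ED) variants discussed in
# the text. Each entry is (enzyme, ATP change per mole of glucose); only
# steps that consume (-1) or generate (+1) ATP are listed.
# Assumption: standard textbook stoichiometry for every kinase step.

ED_VARIANTS = {
    "classical": [
        ("glucokinase", -1),
        ("phosphoglycerate kinase", +1),
        ("pyruvate kinase", +1),
    ],
    "part-phosphorylative": [
        ("KDG kinase", -1),
        ("phosphoglycerate kinase", +1),
        ("pyruvate kinase", +1),
    ],
    # S. solfataricus variant: the nonphosphorylating glyceraldehyde-3-
    # phosphate dehydrogenase bypasses phosphoglycerate kinase.
    "part-phosphorylative, nonphosphorylating GAPDH": [
        ("KDG kinase", -1),
        ("pyruvate kinase", +1),
    ],
    "nonphosphorylative": [
        ("glycerate kinase", -1),
        ("pyruvate kinase", +1),
    ],
}


def net_atp(steps):
    """Return the net ATP change for a list of (enzyme, delta) steps."""
    return sum(delta for _enzyme, delta in steps)


if __name__ == "__main__":
    for name, steps in ED_VARIANTS.items():
        print(f"{name}: net ATP per glucose = {net_atp(steps):+d}")
```

Under these assumptions the tally gives one ATP for the classical and part-phosphorylative routes and zero for both the nonphosphorylative route and the S. solfataricus bypass variant, in agreement with the yields stated in the text.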

It is not clear what advantage the organism 
would gain by having parallel non- and part-phos- 
phorylative pathways, or what regulates which path- 
way is favored. For twenty years or more it has been 
known that microorganisms frequently employ a va- 
riety of pathways for glucose metabolism, depending 
on growth conditions and substrates (110). A power- 
ful example is Thiobacillus A2, in which ED, EM, 
and oxidative pentose-phosphate pathways constitute 
varying proportions of glycolytic flux, depending on 
culture conditions (174). One could speculate that the 
presence of two variant ED pathways may give S. sol- 
fataricus more metabolic versatility, providing a 
greater opportunity to generate intermediates for 
biosynthesis. For example, during glycolytic growth 
conditions there may be a requirement for glyceraldehyde 
3-phosphate, which is used in the production of 
deoxyribose 5-phosphate for nucleoside biosynthesis 
(116). It should also be considered that the nonphos- 
phorylative pathway permits the conversion of glu- 
cose to pyruvate in only three steps, without any ATP 
input, which may permit the supply of carbon to the 
energy-generating citric acid cycle under conditions of 
low ATP. 

In contrast to halophilic archaea, the ED en- 
zymes in S. solfataricus appear to be constitutively ex- 
pressed. Specific activities of glucose dehydrogenase, 
gluconate dehydratase, and KDG aldolase have been 
shown to be unaffected by culture age or growth 
substrate (glucose or yeast extract) (36). Sulfolobus 
species have been shown to maintain an energy store 
in the form of glycogen (94), which may explain the 
presence of the ED pathway enzymes even during 
growth on noncarbohydrate food sources. 


Thermoplasma. Thermoplasma species are ther- 
moacidophilic, facultatively anaerobic, obligate het- 
erotrophs with growth requirements of 33 to 67°C 
and pH 0.5 to 4 (141). Enzymological studies on one 
model organism, Thermoplasma acidophilum (32), 
have revealed that glucose metabolism occurs via the 
nonphosphorylative ED pathway (17) (Fig. 2), as re- 
ported in Sulfolobus species. However, in T. aci- 
dophilum the activities of the lower pathway have 
been confirmed enzymologically. The first enzyme of 
the pathway, glucose dehydrogenase, has been com- 
prehensively characterized and was found to exhibit 
dual cofactor specificity for NAD and NADP (151). 
The complete genome sequence of T. acidophilum 
(129) revealed the expected complement of genes for 
the nonphosphorylative ED pathway, while key activ- 
ities and genes of the EM pathway, notably phospho- 
fructokinase, have not been found. A similar distrib- 
ution of homologous genes has also been found in the 
genome of the closely related organism Thermo- 
plasma volcanium (81). No ortholog of the S. solfa- 
taricus KDG kinase gene can be identified in the 
genomes of Thermoplasma species, so that the pres- 
ence of the part-phosphorylative ED pathway in 
Thermoplasma remains an open question. 

Recently, the genome sequence of Picrophilus tor- 
ridus has revealed a similar profile of metabolic genes 
to Thermoplasma species (54). This organism, and the 
closely related species Picrophilus oshimae, exhibits a 
close physiological relationship to Thermoplasma, 
and it is likely that they metabolize glucose in the same 
way. The first enzyme of the nonphosphorylative ED 
pathway, glucose dehydrogenase, has been character- 
ized from P. torridus, and it exhibits properties similar 
to the enzyme from T. acidophilum (6). 


Pyrococcus, Thermococcus, Methanococcus, and 
Other Hyperthermophiles. Pyrococcus species are 
hyperthermophilic members of the Euryarchaeota, 
with typical growth requirements of 70 to 100°C, 0.5 
to 5% NaCl and pH 5 to 9. They are strictly anaero- 
bic heterotrophs and can be cultured on minimal salt 
media with a variety of carbon sources, such as mal- 
tose and starch. The most researched model organism 
is Pyrococcus furiosus (45). Extensive investigation 
by ¹³C-labeling and enzymological studies has 
revealed glucose metabolism to proceed by a modified 
EM pathway (Fig. 2), with several critical differences 
to the classical pathway (83, 135). Although virtu- 
ally the same sequence of chemical transformations 
is performed to convert glucose to pyruvate, there are 
notable differences in the cofactor usage in this or- 
ganism, and many of the enzymes are not evolution- 
arily related to known bacterial or eucaryal enzymes. 
The first step is catalyzed by an ADP-dependent 
(AMP-forming) glucokinase, which is induced by 
growth on sugar substrates (85). An unusual phos- 
phoglucose isomerase, a member of the cupin super- 
family, then catalyzes the isomerization of glucose 
6-phosphate to fructose 6-phosphate (58, 165). Fruc- 
tose 6-phosphate phosphorylation is catalyzed by 
an ADP-dependent (AMP-forming) phosphofructo- 
kinase (161), which was found to be a member of a 
novel, evolutionarily distinct family of phosphofruc- 
tokinases (family C) (125). As observed in the part- 
phosphorylative ED pathway of S. solfataricus, glyc- 
eraldehyde 3-phosphate is converted directly to 
3-phosphoglycerate, bypassing the activities of glyc- 
eraldehyde-3-phosphate dehydrogenase and phos- 
phoglycerate kinase. However, in P. furiosus this 
reaction is catalyzed by a unique, inducible, ferre- 
doxin-dependent glyceraldehyde-3-phosphate oxi- 
doreductase (164). Phosphoglycerate mutase has also 
been characterized and was again found to represent 
a new protein family, unrelated to bacterial and eu- 
caryal enzymes (163). It has been shown that the gly- 
colytic route in P. furiosus has a net yield of one mole 
of ATP per mole of glucose (83). 

The complete genome sequence of P. furiosus has 
revealed genes for all the expected EM enzymes and 
a notable absence of ED enzyme genes (100). In ad- 
dition, the genomes of Pyrococcus horikoshii (79) 
and Pyrococcus abyssi (22) have been sequenced and 
show a similar distribution of orthologous glycolytic 
enzyme genes. This provides good evidence that the 
sole route of glucose catabolism in the genus Pyro- 
coccus is via the same modified EM pathway. 

In addition to Pyrococcus, some research has 
also been performed on glucose catabolism in other 
organisms of the order Thermococcales. Notably, 
there is good evidence that species of Thermococcus 
employ a glycolytic pathway very similar to that of 
Pyrococcus. For example, the complete pathway has 
been characterized in Thermococcus celer and Ther- 
mococcus litoralis by enzyme assays and labeling 
studies (142). In addition, an ADP-dependent glucok- 
inase has been characterized in T. litoralis (92), and 
an ADP-dependent phosphofructokinase has been 
characterized in T. zilligii (124), with both enzymes 
possessing similar properties to the enzymes from 
P. furiosus. The genome sequence of Thermococcus 
kodakaraensis (53) revealed orthologous sequences 
for all the expected enzymes of the modified EM 
pathway. 

Species of Methanococcus also possess an EM 
glycolytic pathway similar to that in Pyrococcus. The 
most researched model organism is the strictly anaerobic, 
hyperthermophilic autotroph Methanococcus 
jannaschii (75). An ADP-dependent phosphofructo- 
kinase has been characterized in this organism and 
was also found to possess glucokinase activity (130, 
165). Additionally, highly divergent phosphoglucose 
isomerase (127) and phosphoglycerate mutase (56) 
genes have been discovered. The published genomic 
sequence revealed all the expected orthologs of EM 
pathway enzymes (18) and an absence of ED pathway 
genes. A comprehensive study of glycolytic enzyme 
activities has been performed on the closely related 
organism Methanococcus maripaludis (177). En- 
zymes of the EM pathway were detected, while en- 
zymes of the ED and oxidative pentose phosphate 
pathways could not be found. 

Given the strictly autotrophic nature of Meth- 
anococcus spp., it may seem surprising that they pos- 
sess a pathway for glucose catabolism. However, 
several methanogenic genera are known to contain 
glycogen stores (93) that are metabolized by the mod- 
ified EM pathway. A comprehensive investigation of 
phosphofructokinase in methanogenic archaea has re- 
vealed that a number of glycogen-containing organ- 
isms, both mesophiles and thermophiles from the 
genera Methanococcus and Methanosarcina, possess 
the activity (168). Organisms that do not possess 
glycogen stores, such as Methanothermobacter therm- 
autotrophicus, were found not to possess any ADP- 
or ATP-dependent phosphofructokinase activity, 
which suggests they do not contain a complete gly- 
colytic pathway. The inability to find key glycolytic 
genes in the genome sequence of M. thermautotrophi- 
cus provides further support for this assertion (150, 
166). However, it has been documented that 1-¹³C-glucose 
and 6-¹³C-glucose are metabolized by cells 
of the organism to form exclusively 
3-¹³C-2,3-cyclopyrophosphoglycerate, convincingly 
demonstrating that an EM glycolytic pathway is present (43, 44). 
This discrepancy could be explained by a novel pathway 
of glucose metabolism, or by evolutionarily distant 
enzymes of the EM pathway, possibly with unusual 
cofactor dependence. 
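
The reasoning behind such labeling experiments can be sketched in Python. The atom mappings below are the standard textbook ones for the classical EM and ED pathways and are an assumption here, not data taken from the cited studies; EM_MAP, ED_MAP, and labeled_position are illustrative names. In the EM route, both halves of the hexose pass through glyceraldehyde 3-phosphate, so C-1 and C-6 of glucose both end up at C-3 of the three-carbon product. In an ED-type route, C-1 to C-3 of glucose are released directly as pyruvate, so a label at C-1 appears at C-1 (the carboxyl carbon) instead.

```python
# Trace a single labeled glucose carbon into the three-carbon products of
# the EM and ED pathways.
# Assumption: standard textbook atom mappings for the classical pathways.

# glucose carbon number -> carbon number in the three-carbon product
EM_MAP = {1: 3, 2: 2, 3: 1, 4: 1, 5: 2, 6: 3}
ED_MAP = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2, 6: 3}

PATHWAYS = {"EM": EM_MAP, "ED": ED_MAP}


def labeled_position(pathway, glucose_carbon):
    """Return the product carbon that carries the label."""
    return PATHWAYS[pathway][glucose_carbon]


if __name__ == "__main__":
    for pathway in ("EM", "ED"):
        for carbon in (1, 6):
            position = labeled_position(pathway, carbon)
            print(f"{pathway}: glucose C-{carbon} -> product C-{position}")
```

On this simplified picture, the finding that glucose labeled at either C-1 or C-6 gives a product labeled only at C-3 is the pattern expected for an EM route and not for an ED route.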

Several other hyperthermophilic genera have a 
similar glycolytic route to those described above, with 
only minor variations. For example, the hyperthermophilic 
sulfate reducer Archaeoglobus fulgidus 
strain 7324 has been shown by enzymological stud- 
ies to possess all the required activities of the modified 
EM pathway (96), and ADP-dependent kinases for 
glucose (97) and fructose 6-phosphate (60) have been 
characterized. However, orthologous genes could not 
be found in the genome sequence of strain VC16, and 
so it is unclear whether the EM pathway is employed 
universally in this genus (89). The hyperthermophilic 
anaerobe Desulfurococcus amylolyticus has been 
shown by labeling and enzymological studies to use 
a similar EM pathway for glucose catabolism (142), 
although, in this organism, ATP-dependent kinases 
were found for the phosphorylation of glucose and 
fructose 6-phosphate. The aerobic, hyperthermophilic 
crenarchaeon Aeropyrum pernix also appears to em- 
ploy a modified EM pathway with ATP-dependent ki- 
nases. The ATP-dependent glucose phosphorylation is 
catalyzed by a broad-specificity hexokinase, with sim- 
ilarity to the ROK group of bacterial sugar kinases 
(59). The conversion of glucose 6-phosphate to fruc- 
tose 6-phosphate is performed by an unusual bifunc- 
tional phosphoglucose/phosphomannose isomerase 
(61). The ATP-dependent phosphofructokinase is a 
family B enzyme, otherwise only found in certain en- 
terobacteria, which contrasts with the family C ADP- 
dependent enzymes of most hyperthermophilic ar- 
chaea (123). The published genome sequence of this 
organism revealed the expected EM pathway genes 
and suggested that a nonphosphorylating glyceralde- 
hyde-3-phosphate dehydrogenase is employed instead 
of a ferredoxin-dependent enzyme (80). 

An important comparison should be made be- 
tween metabolism in hyperthermophilic archaea and 
the pathways present in hyperthermophilic bacteria. 
Metabolism has been investigated in the hyperther- 
mophilic bacterium Thermotoga maritima, and under 
the conditions used, glucose was metabolized by both 
the classical EM pathway and the classical ED path- 
way (142). None of the archaeal modifications to the 
EM pathway was found, such as a nonphosphorylating 
glyceraldehyde-3-phosphate dehydrogenase 
or kinases with unusual cofactor specificity. Further- 
more, no non- or part-phosphorylative variants of the 
ED pathway have been detected. One interesting 
novel feature is the presence of a fusion enzyme with 
phosphoglycerate kinase and triose-phosphate isomerase 
activities (138). The published genome sequence 
supports these observations as it contains the 
expected profile of genes that encode the enzymes of 
the classical glycolytic pathways (104). 


Thermoproteus. One other model organism that 
should be considered to give a complete overview of 
archaeal glucose metabolism is the anaerobic, 
hyperthermophilic crenarchaeon Thermoproteus tenax, 
which is capable of autotrophic and heterotrophic 
growth. The pathways and enzymes present have 
been pieced together by a combination of genomic, 
enzymological, and microbiological techniques (149). 
Remarkably, this organism appears to contain all 
three variants of glucose metabolic pathways found 
in archaea; it can use the nonphosphorylative and 
part-phosphorylative ED pathways and its own variant 
of the EM pathway (Fig. 2). Notably, EM metabolism 
in T. tenax involves a number of enzymes distinct 
from those in P. furiosus, including a broad-specificity 
ATP-dependent hexokinase of the ROK group (38), 
a family A pyrophosphate-dependent phosphofruc- 
tokinase (148), and an NAD-dependent, nonphos- 
phorylating glyceraldehyde-3-phosphate dehydroge- 
nase (14). A novel class I fructose-1,6-bisphosphate 
aldolase was also discovered in this organism (147), 
and orthologous sequences were subsequently identi- 
fied in virtually all archaeal genomes (166). This en- 
zyme contrasts with the class II enzyme found to be 
present in thermophilic bacteria, such as Thermus 
aquaticus (34). As in Sulfolobus, a single aldolase has 
been shown to be responsible for the cleavage of 
KDPG and KDG, and it has been suggested that the 
part-phosphorylative and nonphosphorylative ED 
pathways may function in parallel (149). It has been 
reported that cells grown with glucose and yeast ex- 
tract metabolize 80 to 90% of glucose via the modi- 
fied EM pathway and the remainder via the modified 
ED pathway(s) (142). The physiological implications 
of the parallel EM and ED pathways in T. tenax are 
not clear but are likely to be affected by the culture 
conditions employed. 


Catabolism of other sugars 


The vast majority of work on archaeal metabo- 
lism has focused on investigating the pathways of glu- 
cose metabolism, and to date there has been little re- 
search into how other hexose or pentose sugars enter 
central metabolism. This is a pertinent question given 
the ability of many saccharolytic archaea to metabo- 
lize alternative carbohydrates. 

One example that is well documented is the me- 
tabolism of fructose in halophilic archaea. Work on 
Haloarcula vallismortis (3) and Halococcus 
saccharolyticus (72) has shown that this sugar enters the EM 
pathway via ketohexokinase-catalyzed, ATP-dependent 
phosphorylation to form fructose 1-phosphate; 
subsequent phosphorylation by 1-phosphofructo- 
kinase produces fructose 1,6-bisphosphate. The pres- 
ence of class I and II fructose-1,6-bisphosphate al- 
dolases has been documented in halophilic archaea, 
and they catalyze the next step in fructose metabolism 
(37). Haloarcula vallismortis and H. mediterranei 
have also been shown to catabolize sucrose and man- 
nitol via this route after initial conversion to fructose 
(4). It is possible that this route is used for fructose ca- 
tabolism in other archaea, such as Sulfolobus species, 
which have been shown to grow on sucrose as the 
sole carbon and energy source (57). A similar route of 
fructose metabolism via the EM pathway exists in 
many bacteria, although in this case, the initial phos- 
phorylation of exogenous fructose accompanies up- 
take by the phosphotransferase transporter (24). 

In S. solfataricus, it has been discovered that 
galactose, the C-4 epimer of glucose, is metabolized by 
the same nonphosphorylative ED pathway enzymes 
that perform glucose catabolism (98, 99). The three 
enzymes of the upper pathway, glucose dehydroge- 
nase, gluconate dehydratase, and KDG aldolase, were 
found to have the necessary substrate promiscuity to 
permit their activity with substrates displaying either 
configuration at C-4. Similarly, enzymes of the part- 
phosphorylative ED pathway, which may occur in 
parallel with the nonphosphorylative variant in S. sol- 
fataricus, have also recently been found to have the 
same substrate promiscuity for the metabolism of 
both sugars (H. J. Lamble, D. W. Hough, and M. J. 
Danson, unpublished observations). 

The catabolism of galactose by both part-phos- 
phorylative (33) and nonphosphorylative (39) vari- 
ants of the ED pathway has been documented in 
other organisms, although in these cases the galactose 
catabolic pathway consists of separate, inducible en- 
zymes. It has been suggested that the ‘metabolic path- 
way promiscuity’ observed in Sulfolobus may also 
exist in other archaeal genera such as Thermoplasma, 
Picrophilus, Haloferax, and Thermoproteus (54, 98). 
However, it has recently been found that specific ED 
enzymes are induced in Haloferax volcanii when the 
organism is cultured in galactose-containing media 
(Lamble, Hough and Danson, unpublished), implying 
that a separate ED pathway is used for galactose ca- 
tabolism in this organism. Additionally, it is reported 
that the gluconate dehydratase of the ED pathway in 
Thermoproteus tenax does not have activity with 
galactonate (1), although it is still possible that other 
enzymes of the pathway are employed for the metab- 
olism of both sugars in this organism. 

In P. furiosus the first step of galactose catabolism 
appears to be phosphorylation at C-1, and a specific 
galactokinase has been characterized (167). This 
suggests that galactose is metabolized via the Leloir 
pathway, in which galactose 1-phosphate is converted 
to glucose 1-phosphate by means of uridine nu- 
cleotide intermediates (64). It is not yet clear whether 
any archaea use the tagatose 6-phosphate pathway 
for galactose catabolism; this pathway performs the 
same series of chemical transformations as the EM 
pathway, but with enzymes that are specific for the al- 
ternative configuration at C-4. 

There may be other examples of “promiscuous” 
central metabolic pathways in the Archaea. For 
example, the modified EM pathways reported in 
A. pernix, Pyrobaculum aerophilum, and T. tenax 
seem to be promiscuous for the metabolism of man- 
nose, the C-2 epimer of glucose. The broad-specificity 
hexokinases found in these organisms have activity 
with glucose and mannose, phosphorylating both 
sugars at C-6. Each of these organisms also contains 
a bifunctional phosphoglucose/phosphomannose iso- 
merase, which converts both compounds to fructose 
6-phosphate. This situation contrasts with bacteria 
and eucarya, which employ a specific phosphoman- 
nose isomerase (61). 

The entry points of several other sugars into 
central metabolism in the Archaea remain unknown 
and will no doubt be addressed by future research. 
One important area to investigate is how pentose sug- 
ars can enter central metabolism. Some pentoses, 
such as D-xylose and L-arabinose, are known to sup- 
port the growth of certain archaea, although to date 
there has been little investigation into how they are 
metabolized. One recent report has described how 
a specific D-xylose dehydrogenase is induced in H. 
marismortui during growth on this sugar (71), and it 
therefore seems likely that the oxidation of D-xylose 
to D-xylonate is the first step in its metabolism. The 
glucose dehydrogenase from both T. acidophilum and 
S. solfataricus has been found to use D-xylose as a 
substrate (Lamble, Hough and Danson, unpublished). 
This may imply that the glucose dehydrogenase from 
the “promiscuous” ED pathway in these organisms 
also acts as the first step in D-xylose catabolism. This 
proposed reaction in pentose catabolism contrasts 
with the situation in bacteria, where D-xylose, L-ara- 
binose, and D-ribose are commonly metabolized via 
the pentose phosphate pathway. 


The pentose phosphate pathway 


The text above has catalogued the different vari- 
ations to the classical EM and ED pathways that have 
been documented in model archaeal organisms. To 
date there is little substantive evidence that any other 
pathways are involved in glucose catabolism in the 
Archaea. This situation contrasts with glucose 
metabolism in bacteria and lower eucarya, where the 
pentose phosphate pathway (115) can also contribute 
to the glycolytic flux. The pentose phosphate pathway 
has an oxidative part whereby glucose 6-phosphate 
is converted to ribulose 5-phosphate with the 
release of CO2. It also has a nonoxidative part 
involving transaldolase and transketolase, into which 
the EM pathway intermediates glyceraldehyde 3- 
phosphate and fructose 6-phosphate can enter. An ex- 
tension to the pathway, the phosphoketolase path- 
way, involves the action of phosphoketolases on 
intermediates of the nonoxidative part and is found in 
some lactic acid bacteria (76). When employed in glu- 
cose catabolism, the different variations result in 
characteristic labeling patterns, as described for the 
EM and ED pathways (Fig. 1), which permit their 
relative contributions to be assessed. In reality, these 
pathways rarely contribute significantly to glycolytic 
flux and are most commonly employed for tetrose 
and pentose biosynthesis (24). 

A comprehensive phylogenetic analysis of the 
genes for enzymes of the pentose phosphate pathway 
in the Archaea has been performed (153). M. jan- 
naschii, T. acidophilum, and T. volcanium are reported 
to possess a complete nonoxidative pentose phosphate 
pathway that is employed for the synthesis of ribose 
5-phosphate. Other organisms are predicted to use the 
ribulose monophosphate pathway for pentose biosyn- 
thesis. There is no evidence for a complete oxidative 
pathway in any sequenced archaeal genome, and genes 
for glucose-6-phosphate dehydrogenase or 6-phospho- 
gluconate dehydrogenase have not been found. How- 
ever, it should be considered that the sequences may 
be too distantly related to known enzymes to be de- 
tected by comparative genomics. Certain species that 
do not have a complete nonoxidative pentose phos- 
phate pathway, such as A. pernix and S. solfataricus, 
do still possess a gene for transketolase, which is pre- 
dicted to function in the synthesis of erythrose 4-phos- 
phate for the biosynthesis of aromatic amino acids. 

A novel pathway of glucose metabolism may op- 
erate alongside the EM route in Thermococcus zil- 
ligii (176). A labeling pattern was observed that is 
consistent with the pentose phosphoketolase path- 
way, and it is proposed that glucose is linked to this 
pathway by first being converted to formate and pen- 
tose phosphate. This novel pathway requires confir- 
mation by enzyme characterization but represents the 
best demonstration so far of an alternative to the EM 
or ED pathway, in any archaeon. Another suggestion 
that an alternative pathway may operate in archaea 
came from the labeling pattern resulting from glucose 
metabolism by autotrophically grown strains of 
Sulfolobus (175), and this work should be reconsidered 
in the light of the novel pathway proposed in T. zilligii. 
It is possible that, as a wider range of organisms 
is investigated under a variety of growth conditions, a 
greater contribution of alternative and/or novel gly- 
colytic pathways may be revealed. 


Gluconeogenesis 


Gluconeogenesis is essential for the generation of 
biosynthetic intermediates and polysaccharide energy 
stores and is commonly active during growth on en- 
ergy sources other than hexoses. The pathway of glu- 
coneogenesis is ubiquitous in all three phylogenetic 
domains and occurs by a reversal of the chemical 
transformations of the classical EM pathway (Fig. 3). 
Two critical irreversible enzymatic steps of this path- 
way involve alternative enzymes in archaeal gluco- 
neogenesis—the reactions catalyzed by pyruvate ki- 
nase and phosphofructokinase. In addition, those 
archaea that use a nonphosphorylating glyceralde- 
hyde-3-phosphate dehydrogenase or oxidoreductase 
employ the traditional enzyme reactions of phospho- 
glycerate kinase and glyceraldehyde-3-phosphate de- 
hydrogenase in the gluconeogenic direction. 

To bypass the pyruvate kinase reaction during 
gluconeogenesis, a number of possible enzymes have 
been reported. Most importantly, phosphoenolpyru- 
vate (PEP) synthase homologs are present in all se- 
quenced archaeal genomes and catalyze PEP forma- 
tion from pyruvate, coupled with the conversion of 
ATP to AMP + Pi. In addition, several archaea have 
been found to contain a predicted gene for PEP 
carboxykinase; this enzyme has been characterized in 
Thermococcus kodakaraensis (51) and catalyzes the 
GTP-dependent conversion of oxaloacetate to PEP, 
coupled to the release of CO2. A gene for malic 
enzyme has also been characterized in T. kodakaraensis 
(52), and orthologous genes are present in several 
other archaea. This enzyme catalyzes the conversion 
of malate into pyruvate, coupled to the reduction of 
NADP+ and the release of CO2. In T. tenax a further 
alternative has been assayed; the enzyme pyruvate 
phosphate dikinase catalyzes the reversible 
interconversion of pyruvate and PEP, coupled to the 
formation of AMP + PPi from Pi + ATP. It is suggested that 
this enzyme may operate in both glycolytic and glu- 
coneogenic pathways (149), although its physiologi- 
cal significance and distribution have not yet been es- 
tablished. Which enzyme is employed as the entry 
point of gluconeogenesis depends on the particular 
growth substrate an organism is utilizing, and is in- 
trinsically connected to how the citric acid cycle is 
functioning (see “The citric acid cycle,” below). 
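
The stoichiometries of the alternative entry reactions described above can be checked with a short bookkeeping sketch in Python. It counts only substrate carbons and phosphoryl groups; water, protons, and the NADP+/NADPH couple are left out because they do not change either count. The GDP product of PEP carboxykinase is the standard textbook assignment and is an assumption here, since the text states only that the reaction is GTP dependent. SPECIES, REACTIONS, totals, and is_balanced are illustrative names.

```python
# Carbon and phosphoryl-group bookkeeping for the reactions that supply
# PEP or pyruvate at the start of gluconeogenesis.
# species -> (tracked substrate carbons, phosphoryl groups)
# Nucleoside skeletons are unchanged by these reactions, so their carbons
# are not tracked (set to 0).
SPECIES = {
    "pyruvate": (3, 0),
    "PEP": (3, 1),
    "oxaloacetate": (4, 0),
    "malate": (4, 0),
    "CO2": (1, 0),
    "ATP": (0, 3),
    "AMP": (0, 1),
    "GTP": (0, 3),
    "GDP": (0, 2),  # assumed product of PEP carboxykinase
    "Pi": (0, 1),
    "PPi": (0, 2),
}

# reaction name -> (substrates, products)
REACTIONS = {
    "PEP synthase": (["pyruvate", "ATP"], ["PEP", "AMP", "Pi"]),
    "PEP carboxykinase": (["oxaloacetate", "GTP"], ["PEP", "GDP", "CO2"]),
    "malic enzyme": (["malate"], ["pyruvate", "CO2"]),
    "pyruvate phosphate dikinase": (
        ["pyruvate", "ATP", "Pi"],
        ["PEP", "AMP", "PPi"],
    ),
}


def totals(names):
    """Sum tracked carbons and phosphoryl groups over a list of species."""
    carbons = sum(SPECIES[name][0] for name in names)
    phosphoryl = sum(SPECIES[name][1] for name in names)
    return carbons, phosphoryl


def is_balanced(reaction):
    """True if carbons and phosphoryl groups match on both sides."""
    substrates, products = REACTIONS[reaction]
    return totals(substrates) == totals(products)


if __name__ == "__main__":
    for name in REACTIONS:
        print(f"{name}: balanced = {is_balanced(name)}")
```

All four reactions balance under these assumptions. The sketch also makes the difference between the two pyruvate-to-PEP routes explicit: PEP synthase releases Pi, whereas pyruvate phosphate dikinase takes up Pi and releases PPi.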

Figure 3. Gluconeogenesis. The reactions of the gluconeogenic 
pathway. Enzymes are denoted by numbers: 1 = phosphoenolpyruvate 
synthase, 2 = enolase, 3 = phosphoglycerate mutase, 
4 = phosphoglycerate kinase, 5 = glyceraldehyde-3-phosphate 
dehydrogenase, 6 = triose-phosphate isomerase, 
7 = fructose-1,6-bisphosphate aldolase, 
8 = fructose-1,6-bisphosphatase, 9 = phosphoglucose isomerase. 

The other step that requires an alternative enzyme 
in archaeal gluconeogenesis is the reaction catalyzed 
by phosphofructokinase, the reverse reaction 
being performed by fructose-1,6-bisphosphatase. Of 
the sequenced archaeal genomes, only that of Halo- 
bacterium NRC-1 was found to contain a canonical 
fructose-1,6-bisphosphatase (type I). However, a novel 
bifunctional inositol-1-phosphatase/fructose-1,6-bis- 
phosphatase (type IV) was subsequently found in 
M. jannaschii (155), and orthologs of this are present 
in several other archaeal genomes. Yet another fruc- 
tose-1,6-bisphosphatase (type V) has been character- 
ized in the genome of T. kodakaraensis, with orthologs 
in virtually all archaeal genome sequences (117). 

The other enzyme steps of the glycolytic pathway 
are readily reversible and permit archaea to form gly- 
colytic intermediates for biosynthesis. The pathway 
of gluconeogenesis has been characterized by enzy- 
mological and labeling studies in several organisms. 
For example, in the autotroph Methanothermobac- 
ter thermautotrophicus, it has been demonstrated 
that labeled carbon from ¹⁴CO2 is incorporated into 
fructose phosphate and glucose phosphate (69), and 
in P. furiosus, enzymological studies have confirmed 
all the required enzyme activities for conversion of 
pyruvate to glucose 6-phosphate (133). The genes re- 
quired for gluconeogenesis have been annotated in all 
the sequenced archaeal genomes, although a gene for 
phosphoglucose isomerase could not be identified in 
the genomes of M. thermautotrophicus or A. fulgi- 
dus (166). However, the latter analysis contradicts 
biochemical studies, which suggest that phosphoglu- 
cose isomerase activity is in fact present in strains of 
both organisms (43, 96). It therefore seems that a 
complete gluconeogenic pathway to glucose 6-phos- 
phate is present in all archaeal genera. In certain or- 
ganisms, glyceraldehyde 3-phosphate and fructose 
6-phosphate may be used directly for biosynthesis via 
the nonoxidative pentose phosphate pathway. In oth- 
ers, such as Sulfolobus, Thermococcus, and Methano- 
coccus, glucose 6-phosphate may be used for the syn- 
thesis of glycogen. 

Conventionally, glycogen is synthesized by con- 
version of glucose 6-phosphate to glucose 1-phos- 
phate and its subsequent conversion to UDP-glucose. 
Glycogen synthase then condenses this compound to 
the growing polymer chain, with the release of UDP. 
There has been little characterization of the enzymes 
of glycogen metabolism in archaea, although glyco- 
gen synthase has been characterized in S. acidocal- 
darius (19, 94). The putative genes of glycogen syn- 
thesis are found in a cluster in the S. solfataricus 
genome, alongside putative genes for glycogen- 
degrading enzymes. A similar cluster has also been 
described in the genome sequence of T. tenax (149). 
Further biochemical characterization is required to 
establish the enzymes and regulation of glycogen syn- 
thesis and breakdown in the Archaea. 


The ubiquity of the gluconeogenic pathway at- 
taches possible evolutionary significance to this path- 
way. While many alternative glycolytic pathways 
have been documented, the gluconeogenic route is 
conserved throughout all three domains of life. In ad- 
dition, it is striking that the enzymes of the lower 
pathway (pyruvate to glyceraldehyde 3-phosphate) 
are significantly more conserved than the upper path- 
way enzymes, which have been found to belong to 
several different sequence families. Indeed, a detailed 
analysis of the distribution and phylogenies of gly- 
colytic and gluconeogenic enzymes has led to the 
proposal that the gluconeogenic pathway is evolu- 
tionarily more ancient (126). 


Regulation of glucose metabolism 


Given the fundamental metabolic role of the in- 
terconversion of glucose and pyruvate in all three do- 
mains of life, the regulation of the enzymes and path- 
ways involved is of critical importance. There has 
been little investigation into the precise nature of the 
mechanisms that underpin the regulation of glycoly- 
sis and gluconeogenesis in the Archaea. However, re- 
cent work on T. tenax and P. furiosus has begun to 
elucidate the control of the pathways in these organ- 
isms. The fact that a nonphosphorylating glyceralde- 
hyde-3-phosphate dehydrogenase or oxidoreductase is 
employed solely in the glycolytic direction in many 
archaea provides a novel regulatory control point. In 
P. furiosus, regulation occurs primarily at the tran- 
script level, and a strong upregulation of glyceralde- 
hyde-3-phosphate oxidoreductase is found during 
growth on cellobiose compared with growth on pyru- 
vate (164). In T. tenax, the nonphosphorylating glyc- 
eraldehyde-3-phosphate dehydrogenase operates un- 
der allosteric control; the enzyme is inhibited by 
NADPH, NADP+, NADH, and ATP and is activated 
by AMP, glucose 1-phosphate, fructose 6-phosphate, 
ADP, and ribose 5-phosphate (15). This allosteric con- 
trol ensures that the enzyme is most active when in- 
tracellular conditions require a higher level of glyco- 
lytic flux. The presence of orthologs of the T. tenax 
enzyme in several other archaea suggests that this 
strategy may also be employed in other organisms. 
Taken together with evidence of transcript level regu- 
lation, it appears that the interconversion of glycer- 
aldehyde 3-phosphate and 3-phosphoglycerate is the 
critical control point in archaeal glycolysis/gluconeo- 
genesis. This contrasts with the situation in bacteria 
and eucarya, where phosphofructokinase and pyru- 
vate kinase are the critical regulatory control points, 
operating under strict transcript and allosteric control. 
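
The allosteric pattern reported for the T. tenax enzyme can be summarized as a simple lookup. The following is only an illustrative sketch: the effector lists are taken from the text (reference 15), whereas the names INHIBITORS, ACTIVATORS, and classify_effector are hypothetical and introduced here purely for illustration.

```python
# Effectors of the T. tenax nonphosphorylating glyceraldehyde-3-phosphate
# dehydrogenase, as listed in the text (reference 15).
# INHIBITORS, ACTIVATORS, and classify_effector are hypothetical names.
INHIBITORS = frozenset({"NADPH", "NADP+", "NADH", "ATP"})
ACTIVATORS = frozenset({
    "AMP",
    "ADP",
    "glucose 1-phosphate",
    "fructose 6-phosphate",
    "ribose 5-phosphate",
})


def classify_effector(metabolite):
    """Return the effect on the enzyme that the text reports for a metabolite."""
    if metabolite in INHIBITORS:
        return "inhibitor"
    if metabolite in ACTIVATORS:
        return "activator"
    return "no effect reported in the text"


for name in ("ATP", "AMP", "pyruvate"):
    print(name, "->", classify_effector(name))
```

Signals of a low energy state (AMP, ADP) fall in the activator set and signals of a high energy or reducing state (ATP, NADH, NADPH) in the inhibitor set, which is the sense in which the enzyme is most active when more glycolytic flux is needed.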

In P. furiosus, a recent whole-genome microarray 
analysis (139) has revealed that the glycolytic enzymes 
phosphoglucose isomerase, phosphofructokinase, 
and triose phosphate isomerase are also upregulated 
during growth on carbohydrates. In addition, 
the gluconeogenic enzymes glyceraldehyde-3-phos- 
phate dehydrogenase, phosphoglycerate kinase, and 
fructose-1,6-bisphosphate aldolase are upregulated 
during growth on peptides. A similar study with 
H. volcanii revealed an upregulation of several ED 
pathway genes when growth substrate was switched 
from amino acids to glucose (178). In T. tenax, there 
is evidence that pyruvate kinase is upregulated at the 
transcript level during heterotrophic growth (137). 
There is no evidence for allosteric control of this en- 
zyme in the Archaea, other than positive cooperativ- 
ity with its substrates PEP and ADP (70). 

In certain organisms, such as Thermoplasma 
spp., glycolysis and gluconeogenesis are performed by 
separate pathways. This may simplify the regulatory 
requirements in these organisms, as fewer enzymes 
are required to function in both catabolic and ana- 
bolic directions. Furthermore, the existence of “pro- 
miscuous” metabolic pathways in certain archaea 
may simplify the regulatory requirements during 
growth on different energy substrates. Convention- 
ally, growth on an alternative sugar requires an or- 
ganism to activate the transcription and expression of 
new enzymes and pathways, via complex regulatory 
mechanisms. If the same pathway of enzymes is used 
for more than one sugar, then this is not required and 
may permit an organism to adjust more efficiently to 
alternative energy sources. 

Another process that may play a role in the reg- 
ulation of archaeal glycolysis/gluconeogenesis is phos- 
phorylation, and it has recently been suggested that 
certain metabolic enzymes in S. solfataricus are sub- 
ject to control by a phosphorylation-dephosphoryla- 
tion mechanism (88, 120) (see Chapter 11). However, 
the precise nature of this mechanism, and its contri- 
bution to the regulation of archaeal carbohydrate me- 
tabolism, remains to be elucidated. 


The Metabolic Fate of Pyruvate 


Oxidation to acetyl-CoA via pyruvate 
oxidoreductase 


In contrast to the diversity of catabolic pathways 
leading from hexoses to pyruvate, there is a distinct 
unity in the manner in which the Archaea convert 
pyruvate to acetyl-coenzyme A (acetyl-CoA), namely 
via a pyruvate ferredoxin (Fd) oxidoreductase (POR) 
(reviewed in reference 140). This enzyme is active in 
all archaea, whether aerobic or anaerobic, and cat- 
alyzes the oxidative decarboxylation of pyruvate: 


Pyruvate + CoASH + Fd(ox) → acetyl-CoA + CO2 + Fd(red) 
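

The carbon balance of the POR reaction can be checked with a short tally. This is a minimal sketch rather than a metabolic model: the carbon counts are standard molecular-formula values (coenzyme A is C21H36N7O16P3S), ferredoxin is left out because the protein carrier only changes redox state, and the names CARBON_COUNT, carbons, and is_carbon_balanced are hypothetical.

```python
# Carbon-balance check for the POR reaction (illustrative sketch).
# Ferredoxin is omitted: it only changes redox state, not composition.
CARBON_COUNT = {
    "pyruvate": 3,
    "CoASH": 21,       # coenzyme A, C21H36N7O16P3S
    "acetyl-CoA": 23,  # coenzyme A plus a two-carbon acetyl group
    "CO2": 1,
}


def carbons(species):
    """Sum the carbon atoms over a list of species names."""
    return sum(CARBON_COUNT[name] for name in species)


def is_carbon_balanced(substrates, products):
    """True if both sides of the reaction carry the same number of carbons."""
    return carbons(substrates) == carbons(products)


print(is_carbon_balanced(["pyruvate", "CoASH"], ["acetyl-CoA", "CO2"]))
```

The three carbons of pyruvate are accounted for as the two-carbon acetyl group handed to CoA plus one CO2.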


The decarboxylation reaction is a thiamine py- 
rophosphate (TPP)-dependent process, and the acetyl 
group is handed directly to CoA. FeS centers in the 
enzyme serve to direct electron flow to the electron 
acceptor, which is ferredoxin in all archaea so far investigated. 
The cell ultimately disposes of the electrons 
as H2, H2S, or an organic acid. 

Many archaeal PORs (e.g., from P. furiosus, 
A. fulgidus, and M. thermautotrophicus) are octameric 
in nature (α2β2γ2δ2), with whole Mr values 
of about 240,000 (90, 95, 158). The POR from Halobacterium 
halobium is an α2β2-tetramer (113), whereas 
heterodimeric enzymes have been characterized from 
S. solfataricus (179) and A. pernix (106). Sequence 
analyses of the four-subunit PORs indicate that the β-subunit 
contains a TPP-binding motif and four conserved 
cysteines that might bind a [4Fe-4S] cluster, 
and that the δ-subunit contains two conserved [4Fe-4S] 
cluster-binding motifs (140). By combining these 
data with electron paramagnetic resonance studies on 
the P. furiosus holoenzyme and the δ-component, a 
mechanistic model is proposed whereby the oxidative 
decarboxylation of pyruvate to acetyl-CoA is catalyzed 
by the β-subunit, and electron flow from the 
2-oxo acid to ferredoxin is via the δ-protein (Fig. 4). 

In the Archaea, the POR is one member of a family 
of enzymes that also includes 2-oxoglutarate (α-ketoglutarate; 
KGOR) and branched-chain 2-oxoacid 
(BCOR) oxidoreductases; all three are shown to be 
homologous by sequence comparisons (140). Furthermore, 
in P. furiosus, for example, the δ-, α-, and 
β-genes of POR lie in a cluster close to a similar cluster 
for the BCOR, with the γ-subunit of the two enzymes 
being encoded by a single upstream gene. Not 
all OR enzymes are substrate specific. For example, 
A. pernix has two sets of genes encoding ORs that 
both utilize pyruvate, but only one of which can effi- 
ciently accept 2-oxoglutarate as a substrate; inter- 
estingly, neither enzyme can use the branched-chain 
2-oxoacids (106). S. solfataricus also has an OR 
that oxidatively decarboxylates both pyruvate and 
2-oxoglutarate (50). 

Finally, note that the oxidative decarboxylation 
of pyruvate to acetyl-CoA can be reversed using reduced 
ferredoxin and the oxidoreductase. In combination 
with PEP synthase, this is important for gluconeogenesis 
to function during growth on acetate and 
for those archaea that can fix CO2 via a reductive citric 
acid cycle; both these processes are considered in 
more detail below (see “The citric acid cycle” and 
“Growth on acetate and the glyoxylate cycle”). 


Figure 4. Pyruvate ferredoxin oxidoreductase. Schematic representation of the four-subunit (αβγδ) pyruvate ferredoxin 
oxidoreductase and the proposed pathway of electron flow (adapted from reference 140). Ferredoxin (Fd) is the electron 
acceptor. CoA, coenzyme A; TPP, thiamine pyrophosphate; [4Fe-4S], iron sulfur cluster. 


The production of acetate 


Following the decarboxylation of pyruvate, 
many archaea are able to convert acetyl-CoA to acetate 
with the concomitant production of ATP. Indeed, 
in anaerobic hyperthermophilic archaea such as 
P. furiosus, this reaction represents the major energy- 
conserving step in the fermentation of sugars and 
pyruvate (136). A unique feature of the Archaea is 
that they possess an ADP-dependent acetyl-CoA syn- 
thetase (ACD) to catalyze this production of ATP in 
a one-step conversion: 


Acetyl-CoA + ADP + Pi ⇌ acetate + CoA + ATP 


This contrasts with the situation in all bacteria, which 
use a two-step mechanism involving phosphate acetyl- 
transferase and acetate kinase, with acetylphosphate 
being generated as a metabolic intermediate. 


Archaeal ACD was first detected in T. acido- 
philum (26) and was subsequently characterized in 
a variety of halophilic, hyperthermophilic, and meth- 
anogenic organisms (134). The oligomeric nature of 
these ACDs varies between the α2β2-type in Pyrococcus 
(55, 101) and Pyrobaculum (12), and homologous 
homodimers representing gene and polypeptide 
αβ-fusions in Haloarcula (12), Archaeoglobus, and 
Methanococcus species (103). The different enzymes 
have differing substrate specificities, with the ACD 
from P. aerophilum, for example, utilizing acetyl- 
CoA, isobutyryl-CoA, and phenylacetyl-CoA. Thus 
ACDs play a role in the metabolism of both aliphatic 
and aromatic amino acids, as well as energy genera- 
tion from the catabolism of sugars. 


The production of alanine 


Many anaerobic archaea produce alanine in ad- 
dition to acetate while fermenting sugars and other 
nutrients. Kengen and Stams (84) found that P. furiosus 
produced substantial amounts of L-alanine during 
batch growth on cellobiose, maltose, or pyruvate; ra- 
tios of alanine to acetate, which was also produced, 
varied from 0.07 to 0.8, depending on the redox po- 
tential of the terminal electron acceptor. Alanine for- 
mation from pyruvate was shown to occur via an ala- 
nine aminotransferase, and it was suggested that this 
enzyme might operate in conjunction with glutamate 
dehydrogenase and a ferredoxin:NADP oxidoreductase 
to recycle the electron acceptors involved in 
catabolism. In support of this, the genes encoding the 
aminotransferase and the glutamate dehydrogenase 
appear to be coregulated at the transcriptional level, 
the expression of both being induced when the cells 
were grown on pyruvate (171). 

Similarly, Thermococcus profundus excretes L- 
alanine into the medium (91). High activity of alanine 
aminotransferase was present in the cells, but no ala- 
nine dehydrogenase activity could be detected. It was 
suggested that alanine formation may be initiated by 
ammonia incorporation by glutamate dehydrogenase, 
followed by amino transfer from glutamate to pyru- 
vate by the aminotransferase. Another Thermococcus 
species, T. kodakaraensis, has also been shown to 
produce alanine and acetate when grown on starch or 
pyruvate (53). 

The finding that members of the Thermotogales, 
one of the deepest-branching lineages within the Bacteria, 
also produce L-alanine during glucose fermentation 
has led to the view that this may be an ancestral 
metabolic characteristic (118). 


The citric acid cycle 


The citric acid cycle was discovered in aerobic 
organisms as the pathway through which all nutrients 
can be completely oxidized to CO2 and H2O, with a 
considerably higher yield of energy (ATP) than is 
gained from either the Embden-Meyerhof (EM) or the 
various Entner-Doudoroff (ED) pathways. Following 
the conversion of hexoses to pyruvate by the EM or 
ED pathways, and its subsequent decarboxylation to 
acetyl-CoA, the oxidative cycle “begins” with the 
condensation of acetyl-CoA (C2) with oxaloacetate 
(C4) to form the C6-compound citrate (Fig. 5). The 
pathway then comprises a series of chemical transformations 
to permit the loss of two carbons as CO2 
and the removal of hydrogen atoms by the cofactors 
NAD(P)+ and FAD, the reoxidation of which with 
molecular oxygen yields ATP and H2O. From the remaining 
4-carbon compound, oxaloacetate is regenerated 
to complete the cycle. However, intermediates 
of the cycle are removed for biosynthesis, which necessitates 
the replenishment of oxaloacetate by the so-called 
anaplerotic reactions. For example, pyruvate 
generated from sugar catabolism can be partitioned 
between its oxidative decarboxylation to acetyl-CoA 
and its ATP-dependent carboxylation to oxaloacetate 
by pyruvate carboxylase, the anaplerotic route. 
Clearly, this partitioning is highly regulated in most 
organisms. 

Unlike the different routes that have evolved for 
the catabolism of glucose to pyruvate, the metabolic 
intermediates of the citric acid cycle are remarkably 
consistent throughout the Archaea, Bacteria, and 
Eucarya, making it one of the least variant of the cen- 
tral metabolic pathways in terms of its chemical 
transformations (27). However, considerable varia- 
tion is seen, in particular in archaea and bacteria, 
with respect to the “completeness” of the cycle and 
the use to which it is put (67). In turn, these varia- 
tions clearly reflect the lifestyle of each organism, in 
terms of their aerobic/anaerobic and autotrophic/ 
heterotrophic modes of growth. 


The oxidative citric acid cycle. In the aerobic archaea, 
acetyl-CoA generated from pyruvate can be 
oxidized to CO2 and H2O via an oxidative citric acid 
cycle (Fig. 5), and energy may be generated via ox- 
idative phosphorylation. The constituent enzymes 
have been assayed in a variety of halophilic archaea, 
and in T. acidophilum and S. acidocaldarius (2, 28), 
and the genes have been identified in the genome se- 
quences of these and of Halobacterium NRC-1, Pi- 
crophilus torridus, T. volcanium, S. solfataricus, S. 
tokodaii, P. aerophilum, Ferroplasma acidarmanus, 
and A. pernix. This is the same set of enzymes as 
those in the citric acid cycle of the aerobic bacteria 
and eucarya, although it is thought that it is the 
2-oxoglutarate oxidoreductase, and not the dehydro- 
genase complex, that converts 2-oxoglutarate to suc- 
cinyl-CoA in the Archaea. 


The reductive citric acid cycle. In some archaea, 
the citric acid cycle operates in the reductive mode to 
fix CO2 during autotrophic growth (Fig. 6). Two key 
enzymes are required to reverse the cycle: 2-oxoglu- 
tarate oxidoreductase, which reductively carboxy- 
lates succinyl-CoA to 2-oxoglutarate using reduced 
ferredoxin, and ATP-citrate lyase to drive the forma- 
tion of acetyl-CoA from citrate. The acetyl-CoA can 
then be reductively carboxylated to pyruvate via 
pyruvate oxidoreductase, again using reduced ferre- 
doxin. Note that the two reductive reactions using re- 
duced ferredoxin catalyzed by the oxidoreductases 


Figure 5. The oxidative citric acid cycle and the glyoxylate cycle. The reactions of the citric acid cycle are denoted by solid 
arrows, and the reactions unique to the glyoxylate cycle are shown with dotted lines. Enzymes are denoted by numbers: 1 = 
citrate synthase, 2 = aconitase, 3 = isocitrate dehydrogenase, 4 = 2-oxoglutarate dehydrogenase complex (aerobic bacteria 
and eucarya), 5 = 2-oxoglutarate ferredoxin oxidoreductase (archaea), 6 = succinate thiokinase, 7 = succinate dehydrogenase, 
8 = fumarase, 9 = malate dehydrogenase, 10 = isocitrate lyase, 11 = malate synthase. 


are not possible using NADH and the corresponding 
2-oxoacid dehydrogenase complexes. The operation 
of this pathway in Thermoproteus neutrophilus has 
been supported by enzymic assays and radiolabeling 
studies (132), and the required enzymes have also 
been found in Pyrobaculum islandicum (66) and 
T. tenax (149). Also note that, while autotrophic CO2 
fixation is common in the Archaea, not all members 
use the reverse citric acid cycle. A reductive acetyl- 
CoA pathway is used in the euryarchaea Archaeo- 
globus, Ferroplasma, and the methanogens (66) (see 
Chapter 13). On the other hand, members of the 
Crenarchaeota, including Metallosphaera sedula, 
Acidianus ambivalens, Acidianus brierleyi, and Sulfolobus 
species contain key enzymes of the 3-hydroxypropionate 
cycle for CO2 fixation (66). 

A number of questions still remain on the modes 
of CO2 fixation in the Archaea. For example, ribulose-1,5-bisphosphate 
carboxylase/oxygenase (RubisCO) 
is present in halophiles, methanogens, and 
several thermophiles (46). The enzyme from these archaea 
is catalytically active in a ribulose bisphosphate-dependent 
CO2-fixation reaction, but its role in 
a reductive pentose phosphate pathway is uncertain 
because a gene for phosphoribulokinase, which gen- 
erates the substrate (ribulose bisphosphate) for the 


Figure 6. The reductive citric acid cycle. Enzymes are denoted by numbers: 1 = malate dehydrogenase, 2 = fumarase, 3 = 
fumarate reductase, 4 = succinate thiokinase, 5 = 2-oxoglutarate ferredoxin oxidoreductase, 6 = isocitrate dehydrogenase, 
7 = aconitase, 8 = ATP citrate lyase. 


RubisCO reaction, has not been identified. Enzymic 
assays performed with alternative substrates have 
provided evidence for a previously uncharacterized 
pathway for the synthesis of ribulose bisphosphate 
from 5-phospho-D-ribose-1-pyrophosphate in M. jannaschii 
and other methanogenic archaea (47). The 
enzyme responsible has been purified and the gene 
identified from the N-terminal protein sequence; in- 
terestingly, there are good homologs of the gene in 
methanogenic, thermophilic, and halophilic archaea. 
However, the quantitative significance of RubisCO in 
CO2 fixation in these autotrophic archaea has not yet 
been assessed. 


The cycle in anaerobic archaea. In contrast to aerobic 
archaea, anaerobic members utilize the citric 
acid cycle primarily for biosynthetic purposes, and 
therefore may not be expected to have the complete 
oxidative cycle. Identifying genes in the citric acid 
cycle is complicated by their sequence similarity to 
genes that are involved in other pathways. For exam- 
ple, isocitrate dehydrogenase (citric acid cycle) and 
3-isopropylmalate dehydrogenase (leucine metabo- 
lism) share significant sequence identity, as do the var- 
ious members of the 2-oxoacid Fd oxidoreductases. 
As a result, it is difficult to make assignments about 
substrate specificity. 

Careful analysis of the genome sequence of the 
anaerobe, T. tenax, revealed the presence of all the 
genes of an oxidative cycle, suggesting that it might 
therefore be functional under heterotrophic growth 
conditions (149). The genes for a complete cycle are 
present in A. fulgidus (89) and some of the enzymes 
have been characterized (156). In P. furiosus, all the 
genes, except for malate dehydrogenase, are present 
in the genome sequence; however, a putative malate 
oxidoreductase gene has been annotated that may 
produce an enzyme with equivalent function (121). 
P. abyssi and P. horikoshii are quite different from 
P. furiosus as they lack the first three enzymes of the 
cycle, namely citrate synthase, aconitase, and isoci- 
trate dehydrogenase (67, 100). 

Under anaerobic conditions, the complete citric 
acid cycle requires terminal electron acceptors other 
than oxygen; these include sulfur, sulfate, thiosulfate, 
and nitrate (136). 

Other anaerobic archaea appear to have partial 
citric acid cycles. For example, in the methanogens 
partial versions of the cycle may exist that fulfil an 
anabolic function (see Chapter 13). In Methanosarcina 
barkeri, enzyme assays show that 2-oxoglutarate 
can be synthesized via citrate, aconitate, and 
isocitrate, whereas in M. thermautotrophicus the en- 
zymes for an incomplete reductive cycle have been as- 
sayed to enable 2-oxoglutarate synthesis via succinate 
(10). Genome sequences confirm the presence of the 
necessary genes for these partial cycles (150), although 
homologs for some of the other citric acid cycle en- 
zymes can also be detected. Indeed, with the pre- 
diction of a novel aconitase in the Archaea (102), 
M. thermautotrophicus might have a complete set of 
genes for the citric acid cycle enzymes, although ex- 
perimental verification of enzymic activities is re- 
quired. T. kodakaraensis appears to lack the genes for 
several key cycle enzymes, but confirmatory enzymic 
analyses have not yet been performed (53). 


Regulation of the citric acid cycle. The citric acid 
cycle is a multifunctional pathway, serving to oxidize 
pyruvate to CO2 and H2O, providing a variety of 
metabolites for biosynthetic reactions, and being the 
main generator of energy under aerobic conditions. 
Consequently, in bacteria and eucarya, the flux through 
the cycle is tightly regulated, with allosteric feedback 
inhibition and covalent control via phosphorylation- 
dephosphorylation being the two most common reg- 
ulatory mechanisms (reviewed in reference 82) (see 
Chapter 11). However, not all the enzymes of the cit- 
ric acid cycle are controlled in these organisms; regu- 
lation has only been consistently observed for the 
pyruvate dehydrogenase complex, citrate synthase, 
isocitrate dehydrogenase and the 2-oxoglutarate dehy- 
drogenase complex. A variety of metabolites serve to 
effect the control of these enzymes, depending on the 
organism and its nutrient source. 

In contrast to the large volume of literature describing 
the regulation of the citric acid cycle enzymes 
from bacteria and eucarya, there is little information 
for the archaea. The pyruvate and 2-oxoglutarate de- 
hydrogenase complexes are replaced by the equivalent 
Fd oxidoreductases in archaea, for which no control 
mechanisms have been reported. Archaeal citrate synthases 
are isosterically inhibited by ATP with Ki values 
similar to those reported for citrate synthases from 
gram-positive bacteria and eucarya (28); however, the 
physiological significance of this inhibition has been 
questioned (172). Finally, no regulation of an archaeal 
isocitrate dehydrogenase has been reported. 

In P. furiosus there is evidence of a strong upreg- 
ulation in the transcription of citrate synthase, aconi- 
tase and isocitrate dehydrogenase during growth on 
maltose compared with growth on peptides (139). In 
contrast, there was no upregulation of these tran- 
scripts when H. volcanii was grown on glucose after 
growth on casamino acids (178). 


Growth on Acetate and the Glyoxylate Cycle 


Several archaea can grow on acetate. The first 
metabolic step for acetate utilization is the conversion 
to acetyl-CoA catalyzed by AMP-forming acetyl-CoA 
synthetase: 


Acetate + ATP + CoA → acetyl-CoA + AMP + PPi 


In the aerobic archaea, acetyl-CoA enters the citric 
acid cycle for energy production. However, from an 
anabolic viewpoint, this poses a problem as the two 
carbon atoms will be lost as CO2, and the extraction 
of any cycle intermediate for biosynthesis would lead 
to the cessation of the pathway. One solution to this 
problem in archaea would be to reductively carboxy- 
late a portion of the acetyl-CoA to pyruvate using re- 
duced ferredoxin and the pyruvate oxidoreductase. 
Pyruvate could in turn replenish the citric acid cycle 
intermediate, oxaloacetate, via pyruvate carboxylase, 
or be converted to PEP via PEP synthase, the gene for 
which is found in all archaeal genomes sequenced. 
PEP could undergo gluconeogenesis or itself be used 
to replenish the citric acid cycle through PEP car- 
boxylase (42). 

A second solution would be the use of the gly- 
oxylate cycle (Fig. 5). The key enzymes, isocitrate 
lyase and malate synthase, have been found in H. volcanii, 
H. marismortui, P. aerophilum, S. solfataricus, 
and S. acidocaldarius. The synthesis of these enzymes 
is induced by growth on acetate in H. volcanii (143), 
and in H. marismortui acetyl-CoA synthetase is coordinately 
upregulated in an acetate-specific fashion 
(13). Acetate induction of isocitrate lyase 
and malate synthase also occurs in S. acidocaldarius 
(162). Another halophile, Halobacterium NRC-1, 
lacks the genes for isocitrate lyase and malate synthase 
and accordingly cannot grow on acetate as sole 
carbon source (107). 
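
Why the glyoxylate cycle solves the anabolic problem of growth on acetate can be seen from a simple carbon tally. The sketch below is illustrative only: it uses the standard textbook stoichiometry of isocitrate lyase and malate synthase, which is not spelled out in this chapter, and the function names are hypothetical.

```python
# Net carbon retained per pass of each route during growth on acetate.
# Stoichiometry is the standard textbook one; function names are hypothetical.

def citric_acid_cycle_pass():
    """One acetyl unit (C2) enters and two CO2 leave: nothing is retained."""
    carbons_in = 2
    co2_lost = 2
    return carbons_in - co2_lost


def glyoxylate_cycle_pass():
    """Two acetyl units (2 x C2) enter and no CO2 leaves.

    isocitrate (C6) -> succinate (C4) + glyoxylate (C2)   [isocitrate lyase]
    glyoxylate (C2) + acetyl-CoA (C2) -> malate (C4)      [malate synthase]
    """
    carbons_in = 2 + 2
    co2_lost = 0
    return carbons_in - co2_lost


print("citric acid cycle:", citric_acid_cycle_pass(), "carbons retained")
print("glyoxylate cycle:", glyoxylate_cycle_pass(), "carbons retained")
```

The four carbons retained by the glyoxylate route correspond to one additional C4 compound per pass, which is what makes intermediates available for biosynthesis without draining the cycle.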

In archaea, the route to glucose for carbon in- 
troduced into the citric acid cycle by the glyoxylate 
cycle has not been defined but may be via PEP car- 
boxykinase and gluconeogenesis as in many bacteria. 
PEP carboxykinase catalyzes the reaction: 


Oxaloacetate + GTP/ATP → PEP + CO2 + GDP/ADP 


Within the Archaea, a GTP-dependent enzyme has 
been characterized in T. kodakaraensis (51), although 
this organism does not appear to have the glyoxylate 
cycle enzymes, isocitrate lyase and malate synthase. 
The transcription and activity levels of the enzyme in 
T. kodakaraensis were higher under gluconeogenic 
conditions than under glycolytic conditions, consis- 
tent with a role in the supply of PEP from oxalo- 
acetate when gluconeogenesis is operational. A gene 
for PEP carboxykinase has been tentatively identified 
in the genomes of Sulfolobus and Aeropyrum, but no 
enzymological data are available. A second exit possibility 
is via malic enzyme, which oxidatively decarboxylates 
malate to pyruvate; this enzyme has also 
been detected in T. kodakaraensis and the recombinant 
protein characterized (52). 


Catabolism of Amino Acids 


Many archaea can take up and catabolize pep- 
tides and amino acids, and some, notably the halo- 
philic archaea and Pyrococcus, can use these as their 
sole carbon and energy sources. In the sequenced ar- 
chaeal genomes, aminotransferase genes have been 
identified. The archaeal aminotransferases presum- 
ably serve a similar role to aminotransferases of bac- 
teria and eucarya, namely to convert amino acids to 
their corresponding 2-oxoacids by a transamination 
reaction with another 2-oxoacid. For example, ala- 
nine can be transaminated to pyruvate, aspartate to 
oxaloacetate, and glutamate to 2-oxoglutarate. The 
products of the transamination reactions can then 
directly enter the pathways of central metabolism. 
Other 2-oxoacids have to undergo a series of chemi- 
cal transformations to convert them to central meta- 
bolic intermediates, although not all these pathways 
in the Archaea have been defined by enzymic studies. 

The deamination of the branched-chain amino 
acids valine, leucine, and isoleucine yields the 
branched-chain 2-oxoacids 3-methyl-2-oxobutanoate, 
4-methyl-2-oxopentanoate, and 3-methyl-2-oxopentanoate, 
respectively. In archaea, these, in turn, 
are converted to their corresponding acyl-CoA derivatives 
by the BCOR (140). Further metabolism 
converts them to acetyl-CoA and/or succinyl-CoA, 
both of which can enter the citric acid cycle. 

As mentioned above (see “The metabolic fate of 
pyruvate”), BCOR is a member of a family of Fd- 
linked oxidoreductases that also includes the pyru- 
vate (POR) and 2-oxoglutarate (KGOR) enzymes. 
The presence of this family of active oxidoreductase 
enzymes in the aerobic archaea might be considered 
unexpected as the reactions they carry out in aerobic 
bacteria and eucarya are catalyzed by 2-oxoacid 
dehydrogenase multienzyme complexes. However, 
2-oxoacid dehydrogenase complex activity has not 
been detected in any members of the Archaea, sup- 
porting the view that their oxidoreductases are suffi- 
cient for these metabolic steps. While these observa- 
tions appear definitive, genes that potentially encode 
the components of 2-oxoacid dehydrogenase com- 
plexes have been identified in aerobic archaea. To un- 
derstand the significance of these findings, a brief dis- 
cussion of the mechanism and structure of the 
bacterial and eucaryal complexes is necessary. 

The 2-oxoacid dehydrogenase complexes cat- 
alyze the general reaction: 


2-Oxoacid + CoASH + NAD+ → acyl-SCoA + CO2 + NADH + H+ 


Similar to the oxidoreductases, members of this 
family include the pyruvate dehydrogenase complex 
(PDHC, catalyzes the conversion of pyruvate to 
acetyl-SCoA), the 2-oxoglutarate dehydrogenase 
complex (OGDHC, 2-oxoglutarate to succinyl- 
SCoA), and the branched-chain 2-oxoacid dehydro- 
genase complex (BCODHC, oxidatively decarboxy- 
lates the 2-oxoacids produced by the transamination 
of amino acids valine, leucine, and isoleucine). The 
complexes are all three-component systems consisting 
of multiple copies of enzymes E1 (2-oxoacid decar- 
boxylase), E2 (dihydrolipoyl acyltransferase), and E3 
(dihydrolipoamide dehydrogenase) (111, 112). E2 
forms the structural core of the complex, to which 
copies of E1 and E3 are noncovalently bound. The 
number of copies of each component can vary be- 
tween the different complexes and between phyloge- 
netic groups of any one system (68, 112). For exam- 
ple, in the PDHC from gram-negative bacteria there 
are 24 polypeptide chains of the E2 component in 
each core molecule, whereas in the complex from 
gram-positive bacteria and eucarya there are 60 E2 
chains. Most OGDHCs and BCODHCs have 24 E2 
chains in their core structures. E2 also forms the cat- 
alytic core of these multienzyme complexes, with 
each E2 polypeptide chain having at least one cova- 
lently bound acyl-carrying cofactor, lipoic acid, which 
serves to connect the three active sites and to channel 
substrate through the enzyme complex (Fig. 7). 
Substrate specificity is determined by the E1 and E2 
components, while E3 serves a common role in reox- 
idizing the enzyme-bound dihydrolipoamide pro- 
duced by acyl-SCoA formation; consequently, it is of- 
ten the same E3 gene product that can serve in the 
different 2-oxoacid dehydrogenase complexes. 

The first indications that aerobic archaea might 
contain a 2-oxoacid dehydrogenase complex came 
with the surprising finding that the third component 
of the 2-oxoacid complexes, dihydrolipoamide dehy- 
drogenase (DHLipDH), was active in halophilic ar- 
chaea (29) and in the thermophiles, T. acidophilum 
(152) and A. pernix (H. C. Aass, D. W. Hough, and 
M. J. Danson, unpublished data). The identifica- 
tion of lipoic acid in H. halobium by a combined gas- 
chromatographic and mass-spectrometric procedure 
added to the significance of the enzymological studies 
(114), in that the only known physiological function 
of DHLipDH and lipoic acid is as part of the 
2-oxoacid dehydrogenase complexes. 

Genes with significant sequence identities and 
conserved motifs to the E1α, E1β, E2, and E3 genes 
of bacterial and eucaryal 2-oxoacid complexes were 
identified in H. volcanii (Fig. 8) (31, 73, 169). The four 
ORFs are tightly spaced or overlapping, and a single 
ribosome-binding site and TATA box upstream of the 
ORF 1 start codon, and a transcriptional stop signal 
(poly(dT) tract) downstream of ORF 4, have been pu- 
tatively identified. A single cluster with the same gene 
content and arrangement is found in the genome se- 
quences of all aerobic archaea, with the exception of 
S. solfataricus (Fig. 8). In this organism, the cluster 
comprises E1α, E1β, and E2 genes, with the latter two 
separated by a gene of unknown function; the gene en- 
coding the putative E3 component is 560 bp upstream 
of this cluster. Furthermore, the S. solfataricus E2 gene 
is interrupted by a frameshift, although this may be a 
case of the programmed −1 frameshift recoding events 
that have been discovered in several genes of this hyperthermophilic 
archaeon (21). 


Figure 7. General mechanism of the 2-oxoacid dehydrogenase multienzyme complexes. The 2-oxoacid dehydrogenase 
complexes of bacteria and eucarya comprise enzymes E1 (2-oxoacid decarboxylase), E2 (dihydrolipoyl acyltransferase), and E3 
(dihydrolipoamide dehydrogenase). B, histidine base; Lip, enzyme-bound lipoic acid, showing the structure of the dithiolane 
ring; S-S, protein disulfide bond; TPPH, thiamine pyrophosphate. 


Figure 8. Gene clusters encoding the components of a putative archaeal 2-oxoacid dehydrogenase complex. The arrangement 
and intergene distances (bp) of the ORFs constituting the E1α, E1β, E2, and E3 genes of the proposed archaeal 2-oxoacid 
dehydrogenase are shown. Clusters are shown for Haloferax volcanii, Thermoplasma acidophilum, Aeropyrum pernix, 
Pyrobaculum aerophilum, and Sulfolobus solfataricus. The proposed direction of transcription (left to right, as drawn) is the 
same for all the genes. See text for details of the (−1) frameshift in the E2 gene of S. solfataricus. 

The E1α and E1β genes from T. acidophilum
have been coexpressed in E. coli, and the recombinant
proteins form an α2β2 enzyme that decarboxylates
the branched-chain 2-oxoacids 4-methyl-2-oxopen- 
tanoate, 3-methyl-2-oxopentanoate, and 3-methyl-2- 
oxobutanoate (63). A low catalytic activity is found 
with pyruvate, but the enzyme appears to be inactive 
toward 2-oxoglutarate. Similarly, the E2 gene has 
been recombinantly expressed as a 50% lipoylated 
product, and the E3 gene has been expressed as an ac- 
tive dihydrolipoamide dehydrogenase. Preliminary 
data indicate that the components assemble into an 
active complex with the same substrate specificity as 
the isolated E1 enzyme (C. Heath, H. C. Aass, D. W. 
Hough, and M. J. Danson, unpublished observa- 
tions). The findings support a role for the complex in 
the metabolism of branched-chain amino acids. 

In H. volcanii, the transcript levels for the genes 
encoding the putative 2-oxoacid dehydrogenase com- 
plex decreased when growth was changed from amino 
acid-based to glucose-based metabolism, supporting 
a role for the enzyme complex in the catabolism of 
amino acids but not glucose (178). However, inser- 
tional inactivation of the DHLipDH gene resulted in 
no detectable phenotypic difference from the wild- 
type organism when grown on a variety of central 
metabolic intermediates (73, 74). The catalytic activity 
of the halophilic complex is presently being examined 
using homologous expression of the components. 

In addition to catabolism of amino acids via their 
2-oxoacid derivatives, some halophilic archaea are 
able to ferment arginine to support anaerobic growth. 


The consumption of arginine is coupled to the equimo- 
lar production of ornithine, indicating that this may 
occur via the arginine deiminase pathway, with ATP 
being generated by substrate-level phosphorylation 
(62). The three enzymes in this pathway are arginine 
deiminase, ornithine transcarbamylase, and carba- 
mate kinase, which respectively catalyze the following 
reactions: 


L-arginine + H2O → L-citrulline + NH3

L-citrulline + Pi → carbamoyl-phosphate + L-ornithine

carbamoyl-phosphate + ADP → ATP + NH3 + CO2


The genes for the enzymes are present in a cluster in 
Halobacterium salinarum (128), and the identity of 
the encoded enzymes has been confirmed. 
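
The net stoichiometry of the arginine deiminase pathway can be checked by summing the three reactions listed above. The following Python sketch is only a bookkeeping aid: the metabolite labels and the helper `net_reaction` are illustrative names rather than part of any published tool, and the reactions are taken exactly as written in the text (no atom or charge balancing is attempted).

```python
from collections import Counter

# Each reaction is (substrates, products), copied from the three steps in the text.
REACTIONS = [
    # arginine deiminase
    ({"L-arginine": 1, "H2O": 1}, {"L-citrulline": 1, "NH3": 1}),
    # ornithine transcarbamylase
    ({"L-citrulline": 1, "Pi": 1}, {"carbamoyl-phosphate": 1, "L-ornithine": 1}),
    # carbamate kinase
    ({"carbamoyl-phosphate": 1, "ADP": 1}, {"ATP": 1, "NH3": 1, "CO2": 1}),
]


def net_reaction(reactions):
    """Sum the reactions and cancel intermediates.

    Returns two dicts: metabolites consumed overall and metabolites produced overall.
    """
    balance = Counter()
    for substrates, products in reactions:
        balance.subtract(substrates)  # substrates count negatively
        balance.update(products)      # products count positively
    consumed = {m: -n for m, n in balance.items() if n < 0}
    produced = {m: n for m, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction(REACTIONS)
print("consumed:", consumed)
print("produced:", produced)
```

Summing the steps cancels L-citrulline and carbamoyl-phosphate, leaving one L-ornithine and one ATP per L-arginine consumed, which is consistent with the equimolar ornithine production and the substrate-level phosphorylation noted above.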


PERSPECTIVE: THE NEXT FIVE YEARS 
Recent Developments 


Although the study of microbial metabolism 
dates back to the foundations of the disciplines of 
microbiology and biochemistry, the advent of ge- 
nome sequencing in the past decade has resulted in a 
new approach to the study of metabolic pathways. 
Annotated genome sequences are currently available 
for 20 species of archaea, with additional sequencing
projects in progress, and this information has
enabled comparative and functional genomics
approaches to be added to the traditional techniques of
enzyme purification and characterization and of 
metabolic labeling. 

Publication of a new archaeal genome sequence 
often includes an integrated metabolic overview of 
nutrient uptake, energy production, biosynthesis, and 
other aspects of the biology of the organism (see ref- 
erences 89, 105, 146). Such overviews are usually in- 
complete since not all genes can be assigned an un- 
ambiguous functional annotation. In fact, all genome 
sequences published to date include a substantial 
number of hypothetical genes of unknown function 
that typically comprise up to 50% of the entire ge- 
nome. Despite these limitations, a comparative ap- 
proach, integrating genomic and functional informa- 
tion across a range of archaeal species, has been used 
to identify archaeal glycolytic pathways and the en- 
zymes involved (166). However, it is evident that 
most of the proposed gene functions await experi- 
mental confirmation, and this is an ambitious goal 
given the range of organisms and enzyme activities in- 
volved. The completion of further genome sequences 
can only add to this problem. 

The advent of genome sequencing has provided 
the background for global analysis of cell function by 
following changes in mRNA levels using DNA mi- 
croarrays (transcriptomics) and protein expression, 
using two-dimensional gel electrophoresis or high-per- 
formance liquid chromatography and mass spectrom- 
etry (proteomics) (see Chapter 20). These approaches, 
applied initially to eucaryal (35) and bacterial (87) 
systems, have subsequently been applied to archaea. 
The first archaeal whole-genome DNA microarray 
was constructed for P. furiosus, using the 2065 ORFs 
annotated in the genome sequence (139). The expression
levels of 8% of expressed ORFs were found
to vary substantially between peptide- and maltose-grown
cells, and most of these ORFs were members of
27 putative operons. Among the 18 operons upregu- 
lated in maltose-grown cells were those responsible 
for maltose transport and the biosynthesis of several 
amino acids. An operon encoding three enzymes of 
the citric acid cycle, citrate synthase, aconitase, and 
isocitrate dehydrogenase, was also upregulated, as 
were several ORFs encoding hypothetical proteins of 
unknown function. Several of the nine operons up- 
regulated in peptide-grown cells were involved in 
transamination of amino acids and metabolism of the 
resulting 2-oxoacids. It was also shown in a number 
of cases that there was good correlation between mi- 
croarray data and changes in enzyme activity levels. 

In the absence of a complete genome sequence, 
metabolic adaptation of H. volcanii following a 
switch from growth on amino acids to growth on glucose
has been studied using a shotgun DNA microarray
(178). Again there were significant changes in
expression levels for about 10% of all genes, some of 
which were expected on the basis of metabolic studies. 
The expression of genes encoding enzymes of the part- 
phosphorylative ED pathway was upregulated on glu- 
cose, as was a potential glucose transporter. Among 
the genes repressed on glucose were a number encod- 
ing proteins involved in translation and ATP synthesis. 
Also, there was a decrease in transcription of the genes 
constituting the putative 2-oxoacid dehydrogenase 
complex operon (73). Enzymic activity corresponding 
to this complex has not been detected in Haloferax, 
and its inactivation does not alter the metabolic phe- 
notype. However, the microarray data suggest a role in 
amino acid catabolism, and this is supported by recent 
work on the corresponding complex in T. acido- 
philum (63). Heterologous expression of the E1 com- 
ponent of the Thermoplasma complex yielded a prod- 
uct with decarboxylase activity typical of the first 
component of a branched-chain 2-oxoacid dehydro- 
genase complex. This and other examples indicate the 
power of microarray-based methods, in combination 
with an enzymological approach, in the discovery of 
unexpected gene functions and metabolic processes. 

Proteomics studies of archaea are still at a pre- 
liminary stage, in particular in the case of halophiles, 
since the instability of their proteins at low ionic 
strength has required the development of new meth- 
ods for two-dimensional gel electrophoresis (77) (see 
Chapter 20). At present the majority of published 
archaeal proteomics studies relate to methanogens. 
For example, in the hyperthermophilic methanogen, 
Methanocaldococcus jannaschii, 963 proteins, repre- 
senting ~54% of the genome, have been identified 
from whole-cell extracts using liquid chromatogra- 
phy and mass spectrometry methods (180). 

It is perhaps unsurprising that there are as yet 
few instances of an integrated microarray and pro- 
teomics study of an archaeal species. Baliga et al. (9) 
have analyzed mRNA and protein profiles of wild- 
type and mutant strains of Halobacterium NRC-1 
and found that in many cases changes in protein ex- 
pression are not reflected in changes at the mRNA 
level, suggesting that these proteins are regulated by 
posttranscriptional control mechanisms. Furthermore,
they concluded that two major energy production
pathways in this organism, phototrophy and arginine
fermentation, are inversely regulated to control
ATP production under anaerobic conditions.


The Next 5 Years and Beyond 


In the postgenomic era there is an increasing trend 
toward understanding the mechanisms underlying biological
responses in the context of the whole organism.
This is the goal of Systems Biology, an approach
that aims to integrate genomics and proteomics with 
quantitative kinetic information on enzyme activities 
and metabolite levels. The aim is then to use mathe- 
matical and computational models to construct reac- 
tion networks that can simulate cellular functions. In 
the context of metabolism this extends the concept 
of a metabolic pathway, leading to the definition of a 
complex metabolic network that exhibits properties 
that cannot be described simply in terms of its indi- 
vidual components (109). However, this approach re- 
quires a vast amount of information relating to the 
organism of interest, which cannot be derived entirely 
from genome sequencing. At present there are severe 
limitations on both the extent and accuracy of se- 
quence annotation and, even with high-throughput 
techniques, we are far from a situation where the 
function of every open reading frame can be con- 
firmed experimentally and the kinetic and regulatory 
properties of the encoded enzyme defined. 

Progress has been made by focussing on one as- 
pect of the system. For example, central metabolism 
in E. coli has been modeled in terms of in vitro en- 
zyme kinetic measurements and in vivo measure- 
ments of intracellular metabolite concentrations (20). 
Models such as this can be used to predict the effects 
of changes in nutrient supply and are an essential 
starting point for rational metabolic engineering. 
Clearly we are far from the point at which such meta- 
bolic models can be generated for an archaeal species. 
There is no model archaeal organism analogous to 
E. coli for which we have the necessary background 
data, and to accumulate this information will take a 
substantial, well-focused effort. Although genomic 
and proteomic studies will certainly be carried out on 
an ever-increasing range of archaea, the generation 
of enzyme kinetic and metabolomic data is likely to 
follow much more slowly. In some cases, in particular, 
with the hyperthermophiles, there will be additional 
factors related to enzyme and metabolite stability that 
will add a further dimension to the already complex 
models derived for mesophilic organisms. 


CONCLUDING REMARKS 


Central metabolism represents one of the most 
fundamental aspects of the biochemistry of the cell 
and is commonly perceived as invariant and sacro- 
sanct. However, in reality there is a considerable di- 
versity to the pathways, enzymes, and cofactors that 
comprise it, and this is no better illustrated than by 
considering the situation in the Archaea. The preced- 
ing text has served to catalog and evaluate the current 
knowledge of central metabolism in the Archaea, and
a striking variability in the nature of the processes and
pathways in different genera is revealed. Nevertheless, 
trends can be discerned and unusual or distinctive fea- 
tures of archaeal metabolism can be identified. 

Perhaps the most obvious trend in the nature of 
central metabolism in archaea is that the distribution 
of pathways and enzymes in a particular organism is 
closely linked to its specific growth environment. For 
example, halophiles use the part-phosphorylative ED 
pathway, thermoacidophiles use the nonphosphoryla- 
tive ED pathway, and hyperthermophiles use variants 
of the EM pathway. This situation appears to apply 
regardless of the position of an organism in the 
rRNA-based universal phylogenetic tree. For example,
the crenarchaeon S. solfataricus and the euryarchaeon
T. acidophilum both possess the nonphosphorylative
ED pathway. Both organisms grow in
thermoacidophilic environments, and there is evidence
of considerable lateral transfer of genes between
them (129). The hyperthermophilic crenarchaeon A.
pernix possesses the EM pathway, as do euryarchaeal
hyperthermophiles such as A. fulgidus and P. furiosus.
This observation provides some support for the
proposal that metabolic genes are well suited for lateral
transfer (173), a process that is undoubtedly favored
between organisms that occupy a similar environmental
niche. However, note that the hyperthermophilic
bacterium T. maritima possesses EM and ED
pathways similar to the classical bacterial and eu- 
caryal pathways, despite evidence of large-scale lat- 
eral transfer between bacterial and archaeal hyper- 
thermophiles (7). 
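
The comparison drawn in this paragraph can be made explicit with a small table. The Python sketch below encodes only the archaeal organisms, lineages, growth environments, and pathways named above; the tuple layout and the helper `pathways_by` are illustrative assumptions, and the environment labels follow the classification used in this paragraph.

```python
from collections import defaultdict

# (organism, lineage, growth environment, glycolytic pathway), as stated in the text.
ORGANISMS = [
    ("S. solfataricus", "Crenarchaeota", "thermoacidophile", "nonphosphorylative ED"),
    ("T. acidophilum", "Euryarchaeota", "thermoacidophile", "nonphosphorylative ED"),
    ("A. pernix", "Crenarchaeota", "hyperthermophile", "EM variant"),
    ("A. fulgidus", "Euryarchaeota", "hyperthermophile", "EM variant"),
    ("P. furiosus", "Euryarchaeota", "hyperthermophile", "EM variant"),
]

LINEAGE, ENVIRONMENT = 1, 2  # column indices used for grouping


def pathways_by(column, organisms=ORGANISMS):
    """Group organisms by the given column and collect the set of pathways per group."""
    groups = defaultdict(set)
    for row in organisms:
        groups[row[column]].add(row[3])
    return dict(groups)


by_environment = pathways_by(ENVIRONMENT)
by_lineage = pathways_by(LINEAGE)

for label, groups in (("environment", by_environment), ("lineage", by_lineage)):
    for key, pathways in sorted(groups.items()):
        print(f"{label:12s} {key:18s} {sorted(pathways)}")
```

In this small sample, grouping by growth environment yields a single pathway per group, whereas grouping by lineage mixes both pathways, which is the pattern the paragraph describes. The bacterium T. maritima is deliberately left out because the text presents it as a counterexample.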

As described throughout this chapter, central 
metabolic enzymes of archaea have a number of unusual
and unique features. Researchers have often
speculated that these features
of archaeal metabolism, in particular those found 
in hyperthermophiles, are evolutionarily ancient and 
provide an insight into the nature of central metabo- 
lism in a primitive organism. Examples of this include 
the prevalence of variants of the ED pathway (122), 
the existence of ADP-dependent kinases (83), the ex- 
istence of ferredoxin-dependent enzymes (25), the 
production of alanine (118), and the existence of 
promiscuous pathways (98). It is perhaps more likely 
that, instead of representing ancestral metabolic char- 
acteristics, these features provide some selective ad- 
vantage for survival in a particular environment. As 
such, the diversity of central metabolic pathways 
documented in the Archaea may be just a remarkable 
illustration of the evolutionary adaptation of micro- 
organisms to survival in a variety of different hostile 
growth environments. Many of the “unusual” fea- 
tures of archaeal central metabolism are also found in 
a limited number of bacterial and eucaryal species.
However, certain enzymes, such as ADP-dependent
phosphofructokinase and the nonphosphorylating 
glyceraldehyde-3-phosphate oxidoreductase, have 
not yet been found in any organism other than ar- 
chaea. These enzymes may have evolved specifically 
in the Archaea, presumably because they confer a se- 
lective advantage or, alternatively, they may have 
been present prior to the divergence of the three do- 
mains of life and were replaced in bacteria and eu- 
carya during evolution. 

The discovery of the fascinating array of diverse 
central metabolic characteristics in the Archaea alone 
justifies the efforts that have been expended by a myr- 
iad of researchers over the past two decades. It is 
hoped that this text will help future workers in the 
field to draw together the different aspects of archaeal 
central metabolism and identify areas to target for future
research.

Chapter 13


Methanogenesis 


JAMES G. FERRY AND KYLE A. KASTEAD 


INTRODUCTION 


Methane-producing anaerobes (methanogens) 
were the first organisms identified by Carl Woese as
phylogenetically distinct from all other cell types, and are the
founding members of the Archaea (see Chapter 1). 
Methane-producing species were chosen based on 
their unusual morphological and biochemical charac- 
teristics (11, 38, 60, 114-116, 162, 165, 241, 271, 
277) that guided comparisons of 16S ribosomal RNA 
sequences (64, 241) and discovery of the three-domain 
concept (270). Research on the methanogens, the 
largest group representing the Archaea, over the past 
three decades has had a major impact on our under- 
standing of the ecology, physiology, biochemistry, and 
molecular biology of this domain. Indeed, the path- 
ways for methanogenesis are one of the most ex- 
tensively studied aspects of archaeal biology and one 
of the most prominent features that distinguish this 
domain of life from the Bacteria and Eucarya. Recent 
genomic sequencing, proteomic analyses, and devel- 
opment of genetic systems continue to expand our un- 
derstanding of methanogenesis and the Archaea. 


ECOLOGY 


The Italian physicist Alessandro Volta is most of- 
ten credited with the discovery of biological methane 
formation when he performed the “Volta experi- 
ment” at Lake Como, a freshwater lake in northern 
Italy. Disturbing the sediment with a pole, he released 
the trapped methane and collected the gas bubbles in 
an inverted funnel partly submerged in the water col- 
umn. Lighting the gas produced a flame, prompting 
him to call the gas “combustible air.” Figure 1 shows 
a modern-day Volta experiment in which gas col- 
lected from the sediment of a freshwater pond is ig- 
nited on release by tipping the funnel. 


Including freshwater lakes and ponds, a signifi- 
cant fraction of the earth’s biosphere contains vast 
and diverse oxygen-free environments where anaero- 
bic microbes convert complex organic matter to 
methane and carbon dioxide, an essential link in the 
global carbon cycle (Fig. 2). The process occurs in 
habitats such as the rumen, the lower intestinal tract, 
sewage digesters, landfills, freshwater sediments of 
lakes and rivers, rice paddies, hydrothermal vents, 
coastal marine sediments, and the subsurface (264). A 
consortium of at least three interacting metabolic 
groups of anaerobes converts complex organic matter 
to the most oxidized (carbon dioxide) and reduced 
(methane) forms of carbon (Fig. 2). The process ac- 
counts for nearly one billion metric tons of biologi- 
cal methane produced annually (244). The first two 
groups (Fig. 2, step 3) are primarily from the Bacte- 
ria. The fermentative group decomposes complex or- 
ganic matter to acetate, formate, higher volatile fatty 
acids, hydrogen, and carbon dioxide. The obligate 
hydrogen-producing group decomposes the higher 
volatile fatty acids to acetate, hydrogen, and carbon 
dioxide. The third group (Fig. 2, step 4), the methanogens,
converts the metabolic products of the first
two groups to methane by two major pathways. The
conversion of the methyl group of acetate to methane
(acetate fermentation pathway) accounts for about two-thirds
of the annual production, whereas one-third
derives from the reduction of carbon dioxide with 
electrons supplied from the oxidation of formate or 
hydrogen (carbon dioxide reduction pathway). Thus, 
the methanogens rely on the first two groups to sup- 
ply substrates for growth and methanogenesis. Fur- 
thermore, the production of hydrogen by the fermen- 
tative and obligate hydrogen-producing groups is a 
thermodynamically unfavorable reaction, and growth 
of these groups depends on the hydrogen-utilizing 
methanogens to maintain low concentrations of hydrogen
in the environment. Thus, the conversion of
complex organic matter to methane requires a true
syntrophic association of distinct metabolic groups of
anaerobes.

Figure 1. Reenactment of the Volta experiment illustrating
methanogenesis in a freshwater pond.
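
The dependence of the hydrogen-producing groups on low hydrogen concentrations can be illustrated with the standard relation between free energy change and reactant activities. The Python sketch below is a simplified illustration under stated assumptions: the standard free energy change of +48 kJ/mol and the yield of 2 mol of hydrogen per mol of substrate are illustrative values of the order commonly quoted for the oxidation of a volatile fatty acid to acetate, not figures given in this chapter, and all species other than hydrogen are held at standard conditions.

```python
import math

R = 8.314e-3      # gas constant, kJ mol^-1 K^-1
T = 298.15        # temperature, K
DG0_PRIME = 48.0  # ASSUMED illustrative standard free energy change, kJ/mol
N_H2 = 2          # ASSUMED mol H2 formed per mol substrate oxidized


def delta_g(p_h2, dg0=DG0_PRIME, n_h2=N_H2, temp=T):
    """Free energy change (kJ/mol) when only the H2 partial pressure (atm)
    departs from standard conditions: dG = dG0' + n * R * T * ln(p_H2)."""
    return dg0 + n_h2 * R * temp * math.log(p_h2)


def threshold_p_h2(dg0=DG0_PRIME, n_h2=N_H2, temp=T):
    """H2 partial pressure (atm) below which the reaction becomes exergonic."""
    return math.exp(-dg0 / (n_h2 * R * temp))


for p in (1.0, 1e-3, 1e-5):
    print(f"p(H2) = {p:g} atm -> dG = {delta_g(p):+.1f} kJ/mol")
print(f"threshold p(H2) = {threshold_p_h2():.1e} atm")
```

Under these assumptions the reaction is endergonic at 1 atm of hydrogen and becomes exergonic only when the hydrogen partial pressure falls well below 1e-4 atm, which is why hydrogen removal by a partner organism is required for growth.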


PHYLOGENY 


The domain Archaea is divided into four king- 
doms, Euryarchaeota, Crenarchaeota, Korarchaeota, 
and the recently described Nanoarchaeota (see Chap- 
ter 2). Methanogens are the main constituency of the 
Euryarchaeota and are subdivided into five orders, 
each with distinctive characteristics. 


The Order Methanobacteriales 


The order Methanobacteriales comprises two fam- 
ilies, Methanobacteriaceae and Methanothermaceae. 

With thirty-two species in four genera, the 
Methanobacteriaceae is the largest and most diverse 
family of methanogens (8, 11, 17-19, 23, 26, 30, 33, 
41, 44, 58, 100, 108, 121, 123-125, 138, 141, 142, 
153, 169, 170, 171, 195, 209, 216, 223, 247, 253, 
269, 273, 275, 278, 279, 281, 290). Cells range from
coccoid to filamentous rods, with most species being
coccobacillary or short rods. The cell wall structure
is very similar to that of gram-positive bacteria, ex- 
cept that pseudomurein replaces muramic acid as the 
predominant peptidoglycan polymer.

Figure 2. The global carbon cycle. Steps: (1) Fixation of carbon
dioxide into organic matter, (2) aerobic (oxygen-dependent)
decomposition of organic matter to carbon dioxide, (3) deposition
of organic matter into anaerobic (oxygen-free) environments and
decomposition to metabolic end products by fermentative and
obligate hydrogen-producing anaerobes, (4) conversion of the end
products to methane by the methanoarchaea and escape of the
methane to aerobic environments, (5) aerobic oxidation of methane
to carbon dioxide by oxygen-requiring methylotrophs.

Methanobacteriaceae produce energy by reducing carbon dioxide
with electrons generated by the oxidation of hydro- 
gen, except for species of the genus Methanosphaera 
that reduce the methyl group of methanol with hy- 
drogen (19, 68, 170). Many species also use formate, 
and two also use secondary alcohols as a source of reductant
(44, 281). Utilizing CO2 as their sole carbon
and energy source, species of the genera Methanobac- 
terium and Methanothermobacter are autotrophic. 
Most can also grow in the absence of any organic 
compounds, and some are even capable of fixing dini- 
trogen. Species of Methanosphaera and Methanobre- 
vibacter are heterotrophic, as they require acetate as a 
source of cell carbon. Many species of Methanobac- 
teriaceae can be considered enteric organisms, as they 
have been isolated from the digestive tracts and feces of 
animals, as well as from sewage sludge digesters. All 
species are mesophilic except for those of the genus 
Methanothermobacter that are thermophilic. Nearly 
all species are neutrophilic, having an optimum pH of 
6.8 to 8.0. However, there are two alkalophilic and
one acidophilic species of Methanobacterium as well as
one acidophilic species of Methanobrevibacter (123, 
195, 209, 273). Of particular interest is Methanobacterium
subterraneum, which is not only alkalophilic
but also the only species that is halotolerant and forms
aggregates and, of all known methanogens, the only one
to be isolated from granitic groundwater (123).

The family Methanothermaceae contains but a 
single genus, Methanothermus, composed of only two 
species (137, 236). Cells are rod shaped and motile by 
bipolar flagellar tufts. The cellular envelope is dou- 
bly layered, consisting of an inner pseudomurein layer 
similar to that of the family Methanobacteriaceae and 
an outer protein-containing S layer (see Chapter 14). 
Methanothermus fervidus cells are straight rods that 
occur in pairs or short chains (236). Methanothermus 
sociabilis cells are curved rods growing in large 
aggregates (137). Reduction of carbon dioxide to 
methane with hydrogen is the only means of energy 
production. Both organisms are autotrophic, al- 
though organic supplements such as yeast extract 
may stimulate growth. Both species were isolated 
from solfataric water sources, are neutralophilic, and 
as the name Methanothermaceae implies, are hyper- 
thermophilic, having temperature optima of 83 and 
88°C (137, 236). 


The Order Methanococcales 


Methanococcales is an order of coccoid marine 
species that contains two families, Methanococcaceae 
and Methanocaldococcaceae. 

Methanococcaceae are a small family containing 
only two genera and five species (11, 43, 96, 107, 
120, 159, 234, 239). Cells are irregular cocci, occur- 
ring singly and in pairs, motile by a polar tuft of fla- 
gella, and forming a protein cell wall or S layer. All 
Methanococcaceae reduce carbon dioxide to methane 
by using hydrogen and formate as electron donors. 
With the exception of Methanococcus voltae, which 
requires acetate for its cell carbon, all species are 
autotrophic (11, 234). Species of the genus Methano- 
coccus can utilize sulfide and elemental sulfur as their 
sulfur source and ammonium as their nitrogen 
source; furthermore, Methanococcus maripaludis is 
also capable of fixing N2 (107). The versatile
species Methanothermococcus thermolithotrophicus
can grow on nearly any sulfur and nitrogen source,
including sulfide, elemental sulfur, thiosulfate,
sulfite, sulfate, ammonium, nitrate, and N2 (96). All
species of Methanococcaceae were isolated from marine
or estuarine environments, with Methanothermococcus
found in geothermally heated sea sediments
or deep-sea hydrothermal vents. Despite these
habitats, most species are only halotolerant, preferring
to grow at lower salinities.

The family Methanocaldococcaceae comprises 
only two genera and six species, most of which were 
transferred out of the family Methanococcaceae, dif- 
ferentiated primarily by their optimum growth tem- 
peratures (29, 35, 101, 102, 106, 132, 240, 287). 
Like the Methanococcaceae, cells are motile, irregular 
cocci occurring singly or in pairs. Another difference 
is that, while still using carbon dioxide as the sole car- 
bon and energy source, Methanocaldococcaceae are 
only able to reduce it with hydrogen, with only one 
species capable of utilizing formate (240). Ammonium 
and sulfide serve as the nitrogen and sulfur sources 
and members of the genus Methanocaldococcus also 
require selenium. All species of Methanocaldococ- 
caceae were isolated from deep-sea hydrothermal 
vents and, consistent with this habitat, are hyperthermophiles,
with all species exhibiting growth temperature
optima between 80 and 88°C. These organisms
are slightly acidophilic and prefer salt concentrations 
near seawater levels. The second genus, Methanotor- 
ris, contains only two species that are separated from 
all other Methanocaldococcaceae by the absence of 
both motility and the requirement for selenium, and 
by being slightly more halotolerant (NaCl range: 0.45 
to 7.2%) than other species (35, 240). 


The Order Methanomicrobiales 


The order Methanomicrobiales contains the
families Methanomicrobiaceae, Methanocorpusculaceae,
and Methanospirillaceae.

With twenty-three species in seven genera, 
Methanomicrobiaceae is one of the larger and more 
diverse families of methanogens (9, 11, 24, 40, 43, 50, 
66, 85, 133-135, 155, 168, 174, 187-189, 198, 203- 
205, 233, 250, 266, 267, 274, 276, 282-285). Cell 
morphology ranges from cocci, to short rods, to the 
disc- or plate-shaped cells of the genus Methanoplanus. 
Cell walls are made of proteins and do not contain 
either peptidoglycan or pseudomurein. All species re- 
duce carbon dioxide to form methane by using hydro- 
gen and formate, and many species are also capable of 
using secondary alcohols. Nearly all species are hetero- 
trophic, with acetate being the most common carbon 
source. Methanomicrobiaceae have been isolated from 
a wide range of anaerobic habitats, and their physio- 
logical diversity reflects this. While most species are 
mesophiles, Methanoculleus thermophilicus (“Methanogenium
frittonii”), isolated from the effluent channel
sediment of a nuclear power plant and from the sedi- 
ment of Fritton Lake in England, is thermophilic (85, 
204). There are two psychrophilic species, Methanogenium
marinum, isolated from Skan Bay, Alaska, and
Methanogenium frigidum, isolated from sediments of 
Ace Lake in Antarctica (40, 66). M. frigidum is also one 
of two halophilic organisms, the other being Methan- 
ocalculus halotolerans, which was isolated from the 
well head of an oil field (66, 188). There are also three 
halotolerant organisms: Methanoculleus submarinus 
from deep-sea sediments of the Nankai Trough, 
Methanofollis tationis from a solfataric field, and
Methanocalculus chunghsingensis from a marine water 
aquaculture fish pond (135, 168, 276). 

The Methanocorpusculaceae family, wherein 
Methanocorpusculum is the lone genus containing four 
species, is one of the smallest families of methanogens 
(29, 190, 275, 280, 286, 288). Cells are irregular cocci 
and occur singly or in aggregates. Cell walls are an 
S layer of hexagonally arranged units, and with the ex- 
ception of Methanocorpusculum labreanum, all species 
of this family are motile by flagella (288). Growth oc- 
curs by the reduction of carbon dioxide to methane us- 
ing hydrogen or formate, although Methanocorpuscul- 
um parvum and Methanocorpusculum bavaricum use 
secondary alcohols as reductants (280, 286). All species 
are heterotrophic, requiring acetate, peptones, or ru- 
men fluid to serve as carbon and nitrogen sources. All 
species are mesophilic and neutralophilic. The type 
species M. parvum was isolated from an anaerobic sour 
whey digester inoculated with sewage sludge (280). 
M. bavaricum and Methanocorpusculum sinense were 
isolated from the industrial waste water from a sugar 
factory and a distillery, respectively, whereas M. labre- 
anum was isolated from the surface sediments of Tar Pit 
Lake of the La Brea Tar Pits (286, 288).

Methanospirillaceae is one of two families of 
methanogens comprised of only a single genus and 
species (60). Methanospirillum hungatei cells are 
symmetrically curved rods that form wavy filaments. 
The cells are motile by polar tufts of flagella. The cell 
wall consists of an S layer of nonglycosylated poly- 
peptides and is enveloped by a rigid paracrystalline 
outer sheath that may encase several cells. Formate or 
hydrogen serves as an electron donor for the reduc- 
tion of carbon dioxide to methane. M. hungatei is 
best known for its association with syntrophic anaer- 
obes that require the removal of hydrogen and for- 
mate, products of their metabolism, for continued 
growth (46). M. hungatei fixes N2, and autotrophic
growth is possible, although yeast extract and pep- 
tones stimulate growth. This organism is mesophilic 
with a temperature optimum of 37°C and neu- 
tralophilic with a very narrow pH range for growth 
(between 6.6 and 7.4). M. hungatei was isolated from 
sewage sludge, and a recently discovered strain, 
TM20-1, which may be designated as a new species, 
was isolated from a rice paddy field (246). 


The Order Methanosarcinales 


The order Methanosarcinales contains the fami- 
lies Methanosarcinaceae and Methanosaetaceae. 

Currently, the family Methanosarcinaceae com- 
prises eight genera and twenty-six species (3, 12, 21, 
22, 28, 30, 32, 45, 56, 60, 67, 103, 109-111, 122, 
148, 149, 151, 154, 156-160, 163, 178, 180, 181, 
186, 191, 192, 196, 197, 207, 217, 220, 228, 229, 
232, 235, 251, 268, 289, 291-293, 295, 296). Cells 
are coccoidal or pseudosarcinal, forming a protein cell 
wall devoid of peptidoglycan or pseudomurein and, 
in some cases, are surrounded by a heteropolysaccha- 
ride sheath to form large aggregates. All species can 
grow and produce methane by dismutating methyl 
compounds such as methanol, methyl amines, and 
methyl sulfides. Most also convert the methyl group 
of acetate to methane and reduce carbon dioxide with 
hydrogen as electron donor; thus, the family is the 
most metabolically versatile of all methanogens. Fur- 
thermore, CO can also be utilized by some species 
(207). Ammonium and sulfide serve as the major ni- 
trogen and sulfur sources, and cofactor supplements 
such as biotin are required by some. Species of 
Methanosarcinaceae have been isolated from very di- 
verse anaerobic habitats, and unsurprisingly, exhibit a 
wide range of physiological optima. Most species are 
mesophilic, but Methanosarcina thermophila, isolated 
from a 55°C anaerobic digestor, and Methanomethyl- 
ovorans thermophila, isolated from a methanol-fed 
anaerobic sludge reactor, are thermophilic, while Me- 
thanococcoides burtonii, isolated from Ace Lake in 
Antarctica, Methanosarcina baltica, isolated from the 
Gotland Deep of the Baltic Sea, and Methanococcoides 
alaskense, isolated from Skan Bay, Alaska, are psy- 
chrophilic (67, 103, 220, 251, 296). Since many 
species have been isolated from marine environ- 
ments, they prefer a salinity near that of seawater; 
however, Methanohalobium evestigatum, isolated 
from the sediments of a saline lagoon, is an extreme 
halophile, growing only at a NaCl concentration 
higher than 15.2% with an optimum of 25.1% (292). 
Furthermore, species of the genus Methanohalo- 
philus are moderately halophilic ([NaCl] optimum 
between 5.0 and 12%), balancing their cytosolic os- 
molarity with the environment by synthesizing or- 
ganic osmolytes such as glycine or betaine. With only 
three exceptions, all species of Methanosarcinaceae 
are neutralophilic, having pH optima between 6.5 
and 7.5. The three exceptions are alkaliphilic (pH 
optimum greater than 8.0), and all three were iso- 
lated from saline environments; Methanolobus ore- 
gonensis is halotolerant (exhibiting highest growth 
rate at lower salinities, but continuing to grow in 
[NaCl] up to 9.0%), while Methanolobus taylorii and 
Methanosalsum zhilinae are moderately halophilic 
(148, 163, 191). 
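
The salinity limits quoted above are given as percent NaCl, and a
conversion to molar concentration can make them easier to compare.
The sketch below is an illustration only: it assumes the percentages
are weight per volume (grams per 100 ml), which the text does not
state, and uses 58.44 g/mol as the molar mass of NaCl. The function
name is hypothetical.

```python
NACL_MOLAR_MASS = 58.44  # g/mol

def percent_to_molar(percent_w_v):
    """Convert % NaCl (ASSUMED to mean g per 100 ml) to mol per liter."""
    grams_per_liter = percent_w_v * 10.0
    return grams_per_liter / NACL_MOLAR_MASS

# Percent values taken from the paragraph above.
SALINITY_VALUES = {
    "Methanohalobium evestigatum, lower limit": 15.2,
    "Methanohalobium evestigatum, optimum": 25.1,
    "Methanohalophilus, optimum (low end)": 5.0,
    "Methanohalophilus, optimum (high end)": 12.0,
    "Methanolobus oregonensis, upper limit": 9.0,
}

for label, percent in SALINITY_VALUES.items():
    print(f"{label}: {percent}% NaCl ~ {percent_to_molar(percent):.2f} M")
```

Under that assumption, the 15.2% lower limit for Methanohalobium
evestigatum corresponds to roughly 2.6 M NaCl.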

The family Methanosaetaceae is composed of 
only two species within a single genus (27, 97, 112, 
113, 185, 193, 194, 248, 294). Methanosaeta species 
are nonmotile, straight rods that grow in chains and 
sometimes long filaments and are enclosed in a tubu- 
lar sheath separated by structures called spacer plugs. 
Acetate serves as the only substrate for methanogene- 
sis. Ammonium and sulfide serve as the nitrogen and 
sulfur sources, and both species also require cofactor 
supplements in the form of biotin or vitamins. Metha- 
nosaeta concilii, isolated from sewage sludge, is meso- 
philic, has an optimum pH of 7.0, and is inhibited by 
yeast extract (194). Methanosaeta thermophila, iso- 
lated from sludge of an anaerobic, thermophilic biore- 
actor, is thermophilic, with an optimum temperature of 
55°C, is slightly acidophilic with a pH optimum of 6.0, 
and grows better in the presence of trace metals (185). 
This genus was previously identified as Methanothrix. 
However, the type species of this genus, Methanothrix 
soehngenii, was later discovered to be impure when 
initially characterized, and no cell culture collection 
contains a pure, viable culture. As the type species of 
the genus Methanothrix was invalidated, the rules of 
the International Code of Nomenclature of Bacteria 
state that the genus has no nomenclatural standing, 
and the name Methanothrix was rejected and Methan- 
osaeta was adopted in its place (27, 194). 


The Order Methanopyrales 


The order Methanopyrales is composed of only 
one family, Methanopyraceae, and a single genus and 
species (130). Methanopyrus kandleri are motile rods 
that grow singly or in chains. The cell wall is composed 
of pseudomurein, and cells are also sheathed in a 
“fuzzy” coat of unknown composition. Methano- 
pyraceae reduce carbon dioxide with hydrogen and are 
autotrophic, with ammonium and sulfide serving as ni- 
trogen and sulfur sources. M. kandleri, isolated from 
hydrothermally heated deep-sea sediment, is a hyper- 
thermophile with an optimum temperature of 98°C. It 
has a pH optimum of 6.5 and an optimum NaCl con- 
centration of 2.0% (130). The taxonomic position of 
M. kandleri has been the subject of some controversy. 
Based on its 16S rDNA sequence, M. kandleri was 
placed in a deep branch of the Euryarchaeota, which is 
quite distant from all other methanogens. However, 
since the complete genome has been sequenced, there 
have been analyses of the translation machinery as well 
as the transcription machinery. These studies indicate 
that, while M. kandleri is still sufficiently different 
to warrant its own family, it groups with the 
Methanococcales and Methanobacteriales (31). 


PATHWAYS OF METHANOGENESIS 


Coenzymes and Cofactors 


Figure 3 shows the cofactors and coenzymes in- 
volved in methanogenic pathways, several of which 
are also found in organisms outside the Archaea. 
Methanofuran (MF) (143) and tetrahydrometha- 
nopterin (THMPT) (118) function as one-carbon car- 
riers, the latter coenzyme also functioning in methy- 
lotrophic microbes from the Bacteria domain (39). 
Molybdopterin guanine dinucleotide, common in en- 
zymes from the Bacteria and Eucarya domains, was 
first discovered in the Archaea as a cofactor of for- 
mate dehydrogenase (104, 164) from Methanobac- 
terium formicicum, and later as a prosthetic group of 
formyl-MF dehydrogenase (117). Coenzyme F420 
(F420), a deazaflavin derivative (55), is an obligate 
two-electron carrier accepting or donating a hydride 
ion. Methanophenazine (MP) (1) is a 2-hydroxy- 
phenazine derivative connected by ether linkage to a 
polyisoprenoid side chain that functions as a membrane 
electron carrier. Factor III is a cofactor of several 
methyltransferases and has a structure similar to 
vitamin B12, a major exception being that factor III 
contains a 5-hydroxybenzimidazolyl base (211). Factor 
III functions in methyltransferases by accepting a 
methyl group as the upper axial ligand to the cobalt. 
Coenzyme M is the smallest cofactor known (241). 
The methylated form (CH3-S-CoM) is the substrate 
for the methylreductase, which catalyzes the 
reductive demethylation of CH3-S-CoM to methane 
in all methanogenic pathways. Coenzyme B (CoB-SH) 
is the second substrate for methylreductase, provid- 
ing electrons for the reaction (183). Cofactor F430 
(F430), the prosthetic group of methylreductase, was 
the first nickel-containing cofactor to be described 
(52, 265) and belongs to the family of macrocyclic 
tetrapyrroles (51), distinguished by a much higher 
degree of saturation than other members of the fam- 
ily (heme, siroheme, chlorophyll, bacteriochlorophyll, 
and corrinoids). 

Considerable progress has been made on the 
biosynthesis of several of the above-mentioned cofac- 
tors, including the characterization of enzymes in the 
biosynthetic pathways (77, 79-82, 95, 202, 212, 225, 
226, 255-262). 


The Carbon Dioxide Reduction and Acetate 
Fermentation Pathways 


The production of methane is the energy-yielding 
metabolism of methanogens. Two major pathways 
account for most of the methane produced biologically. 
Approximately two-thirds derive from the 
methyl group of acetate (reaction 1) by the “acetate 
fermentation” pathway (Fig. 4) and approximately 
one-third by the reduction of carbon dioxide with 
electrons from either hydrogen or formate (reactions 2a or 
2b) in the “carbon dioxide reduction” pathway (Fig. 5). 

Figure 4. Steps in the acetate fermentation pathway. Substrates and 
products are shown in bold. 


CH3COO- + H+ → CH4 + CO2 (1) 
CO2 + 4H2 → CH4 + 2H2O (2a) 
4HCOOH → 3CO2 + CH4 + 2H2O (2b) 


Several recent reviews describe both pathways in de- 
tail, including the description of other minor path- 
ways (47, 48, 59, 78, 86, 215). 
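
The overall reactions 1, 2a, and 2b can be checked for element and
charge balance with a few lines of code. The following is a minimal
sketch and not part of the original chapter: the helper names
(parse_species, side_totals, is_balanced) are hypothetical, formulas
are written without subscripts, and a trailing + or - is read as one
unit of charge.

```python
import re
from collections import Counter

def parse_species(formula):
    """Return (element counts, net charge) for e.g. 'CH3COO-' or 'H+'."""
    charge = 0
    # Trailing '+' or '-' signs encode the net charge, one unit each.
    while formula and formula[-1] in "+-":
        charge += 1 if formula[-1] == "+" else -1
        formula = formula[:-1]
    atoms = Counter()
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        atoms[element] += int(count) if count else 1
    return atoms, charge

def side_totals(side):
    """Sum atoms and charge over (coefficient, formula) pairs."""
    atoms, charge = Counter(), 0
    for coeff, formula in side:
        species_atoms, species_charge = parse_species(formula)
        for element, n in species_atoms.items():
            atoms[element] += coeff * n
        charge += coeff * species_charge
    return atoms, charge

def is_balanced(reactants, products):
    """True when both sides carry the same atoms and the same charge."""
    return side_totals(reactants) == side_totals(products)

REACTIONS = {
    "1": ([(1, "CH3COO-"), (1, "H+")], [(1, "CH4"), (1, "CO2")]),
    "2a": ([(1, "CO2"), (4, "H2")], [(1, "CH4"), (2, "H2O")]),
    "2b": ([(4, "HCOOH")], [(3, "CO2"), (1, "CH4"), (2, "H2O")]),
}

for label, (lhs, rhs) in REACTIONS.items():
    print(f"reaction {label} balanced: {is_balanced(lhs, rhs)}")
```

All three reactions balance as written; the same check is a quick way
to catch a dropped coefficient when transcribing reaction schemes.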


Reactions leading to methane that are common 
to both pathways 


Reactions 3 to 5 are common to both major pathways 
(reactions 1 and 2a and b), which differ primarily in the 
steps by which the methyl group is generated and passed to 
THMPT to form CH3-THMPT (Figs. 4 and 5). 


CH3-THMPT + HS-CoM → CH3-S-CoM + THMPT (3) 
CH3-S-CoM + HS-CoB → CoMS-SCoB + CH4 (4) 
CoMS-SCoB + 2e- + 2H+ → HS-CoB + HS-CoM (5) 


Figure 5. Steps in the carbon dioxide reduction pathway. Substrates 
and products are shown in bold. 
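
Summing reactions 3 to 5 shows which species are true substrates and
products of the common segment and which are recycled carriers. The
sketch below is illustrative only; net_reaction is a hypothetical
helper, and species names are treated as opaque labels copied from the
reactions above.

```python
from collections import Counter

def net_reaction(reactions):
    """Add reactions: consumed species count negative, produced positive."""
    net = Counter()
    for reactants, products in reactions:
        for coeff, species in reactants:
            net[species] -= coeff
        for coeff, species in products:
            net[species] += coeff
    # Species that cancel exactly are intermediates or recycled carriers.
    return {species: n for species, n in net.items() if n != 0}

R3 = ([(1, "CH3-THMPT"), (1, "HS-CoM")], [(1, "CH3-S-CoM"), (1, "THMPT")])
R4 = ([(1, "CH3-S-CoM"), (1, "HS-CoB")], [(1, "CoMS-SCoB"), (1, "CH4")])
R5 = ([(1, "CoMS-SCoB"), (2, "e-"), (2, "H+")], [(1, "HS-CoB"), (1, "HS-CoM")])

print(net_reaction([R3, R4, R5]))
```

HS-CoM, HS-CoB, CH3-S-CoM, and CoMS-SCoB cancel, leaving the net
conversion CH3-THMPT + 2e- + 2H+ → CH4 + THMPT.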


Reaction 3 is catalyzed by N5-methyl-THMPT:CoM 
methyltransferase and is the subject of an excellent re- 
view (75). The methyltransferase is integral to the 
membrane and couples the exergonic methyl transfer 
reaction to generation of a sodium ion gradient across 
the membrane that could be used for various energy- 
requiring reactions. The enzyme contains eight non- 
identical subunits (MtrA to H) of which MtrA con- 
tains factor III and is thought to protrude into the 
cytoplasm. It is proposed that transfer of the methyl 
group from CH3-THMPT to factor III is catalyzed by 
MtrH, whereas MtrE is postulated to demethylate 
factor III. It is hypothesized that conformational 
changes induced in MtrA by methylation and demethy- 
lation are transmitted to MtrE, which then drives the 
translocation of sodium. 

The methyl-CoM reductase (Mcr) catalyzes reaction 
4 utilizing HS-CoB as the electron donor, producing 
the heterodisulfide CoMS-SCoB in addition to methane. 
Some species harbor genes encoding two isozymes 
designated MRI and MRII. The crystal structure of 
the MRI enzyme from the carbon dioxide-reducing 
species Methanothermobacter marburgensis 
(Methanobacterium thermautotrophicum strain Marburg) 
reveals an α2β2γ2-subunit structure that forms two 
active sites, each containing F430 (57). Based partly on 
the arrangement of substrates in the structure, a 
mechanism has been proposed in which the first step is a 
nucleophilic attack of [F430]Ni(I) on CH3-S-CoM, 
forming a [F430]Ni(III)-CH3 intermediate and HS-CoM. 
In the next step, electrons are transferred from 
HS-CoM to Ni(III), producing the thiyl radical •S-CoM 
and [F430]Ni(II)-CH3. Methane is then produced by 
protonolysis of [F430]Ni(II)-CH3, and the thiyl radical 
is coupled to ⁻S-CoB to form CoB-S-S-CoM, accompanied 
by a one-electron reduction of Ni(II). In the 
concerted mechanism, specific conformational changes 
ensure entry of CH3-S-CoM adjacent to F430 and before 
entry of HS-CoB in the narrow active site channel 
(61, 76). Investigation of the temperature dependence 
of activity indicates an alternating sites 
mechanism for release of CoMS-SCoB that is driven 
by conformational changes transmitted from the 
adjacent monomer (70). Under debate is an alternative 
to the first step (71), in which Ni(I) attacks the sulfur 
of CH3-S-CoM, producing a free methyl radical that 
reacts with HS-CoB to produce methane and the thiyl 
radical •S-CoB. A novel methyl-CoM reductase, 
containing a modified F430, has been implicated in the 
first step of anaerobic methane oxidation by archaea 
present in anaerobic microbial communities that 
oxidize methane (214). 

In both pathways, the disulfide bond of CoMS- 
SCoB is reduced by heterodisulfide reductase (Hdr), 
yielding the active sulfhydryl forms of the coenzymes 
(reaction 5). This reduction is coupled to formation 
of an electrochemical proton gradient, which drives 
ATP synthesis catalyzed by an A1Ao-type ATP 
synthase (177). Two types of 
Hdr have been described, the two-subunit HdrDE 
and the three-subunit HdrABC. The HdrABC type 
functions in the carbon dioxide reduction pathway 
(87, 89), and the HdrDE type is found in acetate- 
grown cells of M. thermophila (218) and acetate- and 
methanol-grown Methanosarcina barkeri (90, 128). 
The HdrE contains cytochrome b, which is proposed 
to transfer electrons to HdrD. HdrD and HdrBC are 
highly conserved (89, 128), suggesting they are the 
catalytic subunits of their respective enzymes. 
Spectroscopic investigations indicate that a 4Fe-4S center 
in HdrD and HdrB is the active site where reduction of 
the disulfide occurs in two one-electron steps involving 
a thiyl radical intermediate (88). 


Synthesis of CH3-THMPT in the carbon dioxide 
reduction pathway 


The six electrons required for carbon dioxide re- 
duction to CH3-THMPT derive from the oxidation of 
either hydrogen or formate (reactions 6 to 8). 


H2 + 2Fdox → 2Fdred + 2H+ (6) 
2H2 + 2F420 → 2F420H2 (7) 
2HCOOH + 2F420 → 2F420H2 + 2CO2 (8) 


Reaction 6 is coupled to the first step in the pathway 
where reduced ferredoxin (Fdred) donates electrons 
for carbon dioxide reduction to the formyl level, an 
endergonic reaction. Thus, under low partial pressures 
of hydrogen encountered in the environment, 
the reduction of Fdox requires an input of energy. The 
enzymes catalyzing reaction 6 are different between 
Methanosarcina species (47) and “obligate carbon 
dioxide reducing” species such as M. marburgensis 
and Methanothermobacter thermautotrophicus 
(Methanobacterium thermautotrophicum strain ΔH) 
(237). In Methanosarcina species, reaction 6 is cat- 
alyzed by the Ech hydrogenase that utilizes Fd as the 
electron acceptor. The enzyme is a six-subunit com- 
plex with sequence identity to multisubunit hydroge- 
nases in the Bacteria domain and subunits of the 
complex I energy-conserving NADH:quinone oxi- 
doreductases. Thus, it is proposed that Ech hydroge- 
nase depends on the proton gradient to drive the re- 
duction of Fd by reverse electron transport (86). The 
4Fe-4S clusters of the Ech from M. barkeri were as- 
signed to subunits EchC and EchF (63), for which the 
electron paramagnetic resonance signals are pH de- 
pendent (129), suggesting a role in proton transloca- 
tion. The enzyme catalyzing the energy-dependent 
reaction 6 in “obligate carbon dioxide reducing” 
species has not been established. The genomes do not 
encode an Ech hydrogenase, although these species 
encode two multisubunit membrane-bound hydro- 
genases (Eha and Ehb). These enzymes have sequence 
identity to Ech, and the genes are clustered with 
ferredoxins, leading to the proposal that Eha and 
Ehb function in analogy to Ech. The only formate 
dehydrogenases characterized from the methanogens 
are the F420-dependent enzymes from M. formicicum 
(210) and Methanococcus vannielii (105), although 
the mechanism by which electrons are transferred 
from F420H2 to Fd has not been investigated. 
Reaction 7 is catalyzed by an F420-dependent 
hydrogenase (4, 65) with properties distinct from the Ech 
hydrogenase. 

The steps involving conversion of carbon dioxide 
to CH3-THMPT are shown in reactions 9 to 13. 


CO2 + MF + 2Fdred + 2H+ → 2Fdox + formyl-MF + H2O (9) 
formyl-MF + THMPT → N5-formyl-THMPT + MF (10) 
N5-formyl-THMPT + H+ → N5,N10-methenyl-THMPT+ + H2O (11) 
N5,N10-methenyl-THMPT+ + F420H2 → 
N5,N10-methylene-THMPT + F420 + H+ (12a) 
N5,N10-methenyl-THMPT+ + H2 → 
N5,N10-methylene-THMPT + H+ (12b) 
N5,N10-methylene-THMPT + F420H2 → 
N5-methyl-THMPT + F420 (13) 


In reaction 9, carbon dioxide attaches to MF and is 
then reduced to formyl-MF with Fdred, catalyzed by 
formyl-MF dehydrogenase. Reaction 10 is catalyzed 
by formyl-MF:THMPT formyltransferase, and reac- 
tion 11 by methenyl-THMPT cyclohydrolase (Mch). 
Mutants of M. barkeri with mch disrupted hold 
promise for a genetic analysis to further probe the 
physiological function of this enzyme (83). Two 
mechanistically distinct N5,N10-methylene-THMPT 
dehydrogenases catalyze reaction 12, the first of 
which oxidizes reduced F420 (F420H2) (reaction 12a), 
and the second utilizes hydrogen as the electron 
donor (reaction 12b). The hydrogen-utilizing enzyme 
is essentially a hydrogenase that catalyzes the stereo- 
specific transfer of a hydride ion from hydrogen into 
the methylene carbon. The enzyme contains no iron- 
sulfur clusters or nickel in contrast to all known hy- 
drogenases, although it contains a functional iron 
(213) and a low-molecular-mass cofactor of unknown 
composition (152). N5,N10-Methylene-THMPT reductase 
catalyzes reaction 13. The crystal structure 
of the reductase (10) shows the isoalloxazine ring of 
F420 present in a pronounced butterfly conformation 
induced from the Re-face of F420 by a bulge contain- 
ing a nonprolyl cis-peptide bond. 
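
The statement that six electrons are required to reach CH3-THMPT can
be tallied directly from reactions 6, 7, 9, 12a, and 13. The sketch
below is a bookkeeping illustration only; it takes ferredoxin as a
one-electron carrier and F420H2 as a two-electron carrier, as implied
by the stoichiometry of reactions 6 and 7, and all names are
hypothetical.

```python
# Electrons carried per molecule of each reduced carrier.
ELECTRONS_PER_CARRIER = {"Fdred": 1, "F420H2": 2}

# Reductive steps from CO2 to CH3-THMPT: (reaction, carrier, molecules used).
# Reaction 12a is used here; 12b would take the same two electrons from H2.
REDUCTIVE_STEPS = [
    ("9", "Fdred", 2),
    ("12a", "F420H2", 1),
    ("13", "F420H2", 1),
]

def electrons_consumed(steps):
    """Total electrons drawn from reduced carriers over the listed steps."""
    return sum(ELECTRONS_PER_CARRIER[carrier] * n for _, carrier, n in steps)

# Supply side: reaction 6 gives 2 Fdred, reaction 7 gives 2 F420H2.
electrons_supplied = (
    2 * ELECTRONS_PER_CARRIER["Fdred"] + 2 * ELECTRONS_PER_CARRIER["F420H2"]
)

electrons_to_methyl = electrons_consumed(REDUCTIVE_STEPS)
h2_for_methyl = electrons_to_methyl // 2  # each H2 supplies two electrons
h2_for_reaction_5 = 1                     # 2 e- reduce the heterodisulfide
total_h2 = h2_for_methyl + h2_for_reaction_5

print(electrons_to_methyl, electrons_supplied, total_h2)
```

Adding the two electrons needed for reaction 5 gives four H2 per
methane, in agreement with the overall stoichiometry of reaction 2a.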


Synthesis of CH3-THMPT in the acetate 
fermentation pathway 


Two genera are known that ferment acetate to 
methane, Methanosaeta and Methanosarcina, of 
which the latter has been extensively investigated. In 
Methanosarcina species, CH3-THMPT is synthesized 
by reactions 14 to 17. These reactions are widespread 
in diverse anaerobes; thus, an understanding of the en- 
zymes has broad significance outside methanogenesis. 


CH3COO- + ATP → CH3CO2PO3(2-) + ADP (14) 
CH3CO2PO3(2-) + HS-CoA → CH3COSCoA + Pi (15) 
CH3COSCoA + THMPT + H2O + Fdox → 
CH3-THMPT + Fdred + CO2 + HS-CoA (16) 
CO2 + H2O → HCO3- + H+ (17) 

Acetate kinase and phosphotransacetylase catalyze 
reactions 14 and 15, functioning in concert to convert 
acetate to acetyl-CoA. Features of the crystal struc- 
ture (36) of acetate kinase from M. thermophila 
suggest the enzyme belongs to the Acetate and Sugar 
Kinase/Hsc70/Actin (ASKHA) superfamily of phos- 
photransferases and is possibly the founding member. 
Kinetic and biochemical analyses of site-specific 
amino acid variants of acetate kinase have identified 
residues essential for catalysis and established a direct 
in-line mechanism for transfer of the phosphate from 
ATP to acetate (74, 98). The crystal structure of phos- 
photransacetylase complexed with CoA-SH (99, 140) 
has revealed the active site architecture that, together 
with kinetic analyses of the wild-type (139) and site- 
specific amino acid variants, has identified essential 
residues leading to a proposed mechanism for catalysis. 
The concerted mechanism proceeds through 
base-catalyzed generation of ⁻S-CoA, followed by 
nucleophilic attack of the thiolate anion on the carbonyl 
carbon of acetyl phosphate. The five-subunit CO de- 
hydrogenase/acetyl-CoA synthase (Cdh) complex 
cleaves the C—C and C—S bonds of acetyl-CoA, 
transfers the methyl group to THMPT, oxidizes the 
carbonyl group to carbon dioxide, and reduces Fd 
(reaction 16). The active site of the enzyme contains 
nickel in an unusual metal complex (201). A carbonic 
anhydrase, Cam, converts carbon dioxide to bicarbonate 
outside the cell (reaction 17), facilitating removal 
of the product from the cytoplasm. Cam is the 
prototype of an independently evolved class of carbonic 
anhydrases with a left-handed β-helical fold 
and contains iron in the active site (249). 


Electron transport and energy conservation 


Two recent reviews (47, 86) describe this topic in 
detail. In the carbon dioxide reduction pathway, the 
only reactions yielding enough energy for ATP synthesis 
are reactions 3 and 5, which are exergonic by -29 
and -40 kJ/mol, respectively. The H2:CoMS-SCoB 
oxidoreductase systems differ between Methanosarcina 
species (47) and “obligate carbon dioxide 
reducing” species such as M. marburgensis and M. 
thermautotrophicus (237). In Methanosarcina mazei, an 
F420-nonreducing hydrogenase (VhoAGC) oxidizes 
hydrogen and reduces MP (Fig. 6). The VhoAG subunits 
catalyze the oxidation of H2, whereas VhoC 
contains a b-type heme that is proposed to donate 
electrons to MP. The reduced MP transfers electrons 
to the HdrDE-type heterodisulfide reductase, which 
reduces CoMS-SCoB to the sulfhydryl forms of the 
cofactors. The lipid-soluble quinonelike MP functions 
to translocate 2 protons to outside the cell membrane, 
generating a gradient that drives ATP synthesis 
catalyzed by an A1Ao-type ATP synthase. An additional 
2 protons are translocated by the Vho hydrogenase 
for a total of 4 protons translocated by the H2:CoMS-SCoB 
oxidoreductase system. The “obligate carbon 
dioxide reducing” species do not contain cytochromes 
or MP, and the H2:CoMS-SCoB oxidoreductase system 
(Fig. 7) is apparently composed only of a cytoplasmic 
F420-nonreducing hydrogenase (MvhAGD) 
that is tightly bound to the soluble HdrABC type of 
heterodisulfide reductase (237). Evidence is reported 
that HdrABC is anchored to the membrane by the 
HdrB subunit (89), suggesting the H2:CoMS-SCoB 
oxidoreductase system is loosely bound to the membrane. 
The MvhD subunit is proposed to transfer 
electrons to the HdrA subunit of the heterodisulfide 
reductase (237). Conclusive evidence that this 
H2:CoMS-SCoB oxidoreductase system is coupled to energy 
conservation has yet to be reported. However, the 
genome of Methanosphaera stadtmanae, which is only 
able to reduce CH3-S-CoM with hydrogen (see below), 
contains genes encoding HdrABC and MvhAGD of 
the H2:CoMS-SCoB oxidoreductase system, consistent 
with a role in the generation of a proton gradient that 
drives ATP synthesis (68). 

Figure 6. The H2:CoMS-SCoB oxidoreductase system in Methanosarcina species. MP, methanophenazine. 

Figure 7. Predicted topology of the H2:CoMS-SCoB oxidoreductase system in obligate carbon dioxide-reducing species. 

In the acetate fermentation pathway of Methanosarcina 
species, an Fdred:CoMS-SCoB oxidoreductase 
system pumps protons, providing the gradient 
which drives ATP synthesis. The Fdred is generated 
by oxidation of the carbonyl group of acetyl-CoA, 
catalyzed by the Cdh complex (reaction 16). In the 
freshwater species M. barkeri, it is proposed that Fdred 
donates electrons to the Ech hydrogenase (Fig. 8), which 
produces hydrogen and pumps protons (86). This 
contention is supported by gene knockout experiments 
(167). It is presumed that the H2:CoMS-SCoB 
oxidoreductase then oxidizes H2 and reduces the 
heterodisulfide, thus adding two additional 
proton-translocating segments. However, it is unlikely that 
the amount of energy available from the fermentation 
of acetate to methane (ΔG°' = -36 kJ/CH4) can support 
three proton-translocating segments, especially 
when considering that one ATP is expended in converting 
acetate to acetyl-CoA in the first step of the 
pathway. Although the Ech hydrogenase has not been 
identified in the freshwater isolate M. thermophila, an 
HdrDE type of heterodisulfide reductase has been 
purified from membranes of acetate-grown cells with 
associated hydrogenase activity (218), suggesting an 
H2:CoMS-SCoB oxidoreductase system. A role for 
the soluble electron-transferring protein Isf 
(iron-sulfur flavoprotein) has been proposed for M. 
thermophila, in which Isf mediates electron transfer from 
Fdred to the membrane-bound electron transport 
chain (136), but the electron acceptor for Isf has not 
been identified. The crystal structure of Isf shows an 
unusual compact cysteine motif ligating the 4Fe-4S 
cluster that is proposed to accept electrons from Fdred 
and donate to the noncovalently bound FMN (5). In 
summary, the Fdred:CoMS-SCoB oxidoreductase system 
of freshwater Methanosarcina species is 
not fully understood. 
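
The thermodynamic objection raised above, that -36 kJ per methane is
unlikely to support three proton-translocating segments, can be made
concrete with a rough upper-bound estimate. The sketch below is an
illustration under stated assumptions, none of which come from the
chapter: a proton motive force between 0.15 and 0.20 V, two protons
per segment, perfect coupling efficiency, and the standard free energy
taken as the energy actually available.

```python
FARADAY = 96.485  # Faraday constant in kJ per volt per mole of charge

def energy_per_proton(pmf_volts):
    """Free energy (kJ/mol) needed to move one proton against the gradient."""
    return FARADAY * pmf_volts

def max_protons(available_kj, pmf_volts):
    """Upper bound on protons translocated per CH4 at perfect efficiency."""
    return available_kj / energy_per_proton(pmf_volts)

AVAILABLE_KJ_PER_CH4 = 36.0       # magnitude of the value quoted in the text
ASSUMED_PMF_VOLTS = (0.15, 0.20)  # ASSUMPTION: not given in the source
PROTONS_PER_SEGMENT = 2           # ASSUMPTION: for illustration only
SEGMENTS = 3

protons_needed = SEGMENTS * PROTONS_PER_SEGMENT
for pmf in ASSUMED_PMF_VOLTS:
    limit = max_protons(AVAILABLE_KJ_PER_CH4, pmf)
    print(f"pmf {pmf:.2f} V: at most {limit:.1f} H+ per CH4, "
          f"{protons_needed} needed for {SEGMENTS} segments")
```

Under these assumptions the available energy covers roughly two
protons per methane, well short of what three two-proton segments
would require, and that is before the ATP spent on acetate activation
is counted.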


Figure 8. Predicted topology and function of the Fd:CoMS-SCoB oxidoreductase system in Methanosarcina barkeri. MP, 
methanophenazine. 

Recent evidence suggests the marine species 
Methanosarcina acetivorans employs a different strategy 
for oxidizing Fdred and reducing CoMS-SCoB 
(145) that is more compatible with the thermodynamic 
constraints of the pathway (Fig. 9). The genome of 
M. acetivorans does not encode a functional Ech (69), 
and evidence (179) suggests hydrogen is not an 
obligatory intermediate in electron transport. Furthermore, 
gene knockout experiments suggest M. acetivorans 
utilizes an electron transport chain distinct 
from that of M. barkeri (83). Proteomic analyses (145) 
indicate that acetate-grown M. acetivorans preferentially 
synthesizes subunits encoded by a six-gene cluster 
with sequence identity to the energy-converting Rnf 
complex described in species from the Bacteria 
domain (127). Gene knockouts confirm that the complex 
is essential for growth on acetate (W. Metcalf, 
personal communication). Transcriptional analysis 
shows that the six-gene cluster encoding the complex 
is cotranscribed with two flanking genes, one of 
which encodes a cytochrome c (MA0658 in Fig. 9) 
shown to dominate in acetate-grown cells (145). It is 
proposed (but not experimentally verified) that Fdred 
donates electrons to the Rnf complex containing 
cytochrome c, and that MP mediates electron transfer 
between the complex and HdrDE, which pumps two 
protons outside the cell (Fig. 9). It is also hypothesized 
that the Rnf complex pumps an unknown number 
of protons, contributing further to the proton gradient 
which drives ATP synthesis. 


Pathways for Methanogenesis by Dismutation 
of the Methyl Groups of One-Carbon Compounds 


In addition to the carbon dioxide reduction and 
acetate fermentation pathways, Methanosarcina and 
Methanococcoides species (230) convert the methyl 
groups of simple one-carbon compounds, such as 
methanol and methylamines, to methane and carbon 
dioxide for growth. Investigations of these pathways 
have revealed novel biochemistry and molecular 
biology. 


Reactions leading to methane 


Unlike the carbon dioxide reduction and acetate 
fermentation pathways, the methyl group is transferred to 
HS-CoM by a system composed of two methyltransferases 
specific for methanol and monomethylamine (Fig. 10). 
The first (MT1) transfers the methyl group to the cor- 
rinoid cofactor of a protein that is the substrate for the 
second methyltransferase (MT2) that transfers the 
methyl group from the corrinoid protein to HS-CoM. 
The CH3-S-CoM is reductively demethylated by 
methyl-CoM reductase producing methane and CoB- 
S-S-CoM as described above for the two major path- 
ways. Electrons for the reduction are supplied by oxi- 
dation of the methyl group of CH3-S-CoM by reversal 
of the carbon dioxide reduction pathway (Fig. 5). One 
methyl group is oxidized to carbon dioxide to yield six 
electrons for the conversion of three methyl groups to 
methane, as shown for methanol in reactions 18 and 
19. Thus, the pathways are a dismutation of methyl 
groups to carbon dioxide and methane. 


CH3OH + H2O → CO2 + 6e- + 6H+ (18) 
3CH3OH + 6e- + 6H+ → 3CH4 + 3H2O (19) 


Figure 9. Postulated topology and function of the Fd:CoMS-SCoB 
oxidoreductase system in Methanosarcina acetivorans. Please see 
reference 145 for a detailed understanding. The numbers correspond 
to open reading frames of the gene cluster encoding the 
Ma-Rnf complex. MP, methanophenazine. 
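
The split between oxidized and reduced methyl groups follows from the
electron balance of reactions 18 and 19. The short sketch below is
illustrative only and uses hypothetical variable names; it derives the
ratio from the electron counts rather than assuming it.

```python
from fractions import Fraction

ELECTRONS_FROM_OXIDATION = 6  # reaction 18: one CH3OH oxidized to CO2
ELECTRONS_PER_REDUCTION = 2   # reaction 19: 6 e- reduce 3 CH3OH to 3 CH4

# Methyl groups that can be reduced per methyl group oxidized.
reduced_per_oxidized = Fraction(ELECTRONS_FROM_OXIDATION, ELECTRONS_PER_REDUCTION)

total_methyl_groups = reduced_per_oxidized + 1
fraction_oxidized = 1 / total_methyl_groups
fraction_reduced = reduced_per_oxidized / total_methyl_groups

print(reduced_per_oxidized, fraction_oxidized, fraction_reduced)
```

One methyl group in four is oxidized and three in four are reduced, so
the sum of reactions 18 and 19 is 4CH3OH → 3CH4 + CO2 + 2H2O.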


In the first step of the oxidation arm of the dismutation 
pathways, methyl-THMPT:CoM methyltransferase 
transfers the methyl group of methyl-CoM to 
THMPT, an endergonic reaction driven by a 
sodium ion gradient. However, deletion of the operon 
encoding the methyltransferase in M. barkeri produces 
a mutant still able to produce methane from 
methanol, suggesting an alternative to the first step 
in the oxidation of the methyl group of methanol 
involving an unknown methyltransferase (254). As the 
mutant is unable to grow with methanol, the methyl 
group oxidation pathway involving the novel 
methyltransferase is unlikely to be physiologically relevant in 
the native environment. Furthermore, the mutant is 
able to grow and produce methane from methanol 
with electrons supplied by the oxidation of acetate to 
carbon dioxide and formate. However, the role of 
acetate in the metabolism of methanol in nature is in 
doubt since wild-type Methanosarcina species 
preferentially utilize methanol to the exclusion of acetate 
when both are present. 

Figure 10. Steps in the pathway for dismutation of methanol and 
monomethylamine to carbon dioxide and methane. 

M. stadtmanae is a methanol-utilizing species 
with an unusual methanogenesis pathway (170). The 
genome sequence is missing 37 open reading frames 
that are present in the complete genome sequences of 
all other methanogens. Genes are absent for molyb- 
dopterin synthesis (Fig. 3), which is required for 
formyl-MF dehydrogenase. This absence explains 
why the organism is unable to oxidize the methyl 
group of methanol to supply electrons for reduction 
of the methyl group of methanol to methane and re- 
lies on hydrogen as an electron donor (68). 


Electron transport and energy conservation 


Studies on the mechanism of energy conservation 
in the pathway of methanogenesis from methyl-con- 
taining growth substrates have revealed novel elec- 
tron transport components in the Archaea (47). As 
mentioned above (Reactions leading to methane), 
one-fourth of the methyl groups are oxidized to car- 
bon dioxide by a reversal of the carbon dioxide re- 
duction pathway (Fig. 5) to supply electrons for con- 
version of three-fourths of the methyl groups to 
methane (reactions 18 and 19). The electron acceptor 
in this oxidation is F420, and the membrane-bound 
F420H2 dehydrogenase is the first component 
of the F420H2:CoMS-SCoB oxidoreductase system 
(Fig. 11), generating a proton gradient that drives ATP 
synthesis (13). As for all other systems in 
Methanosarcina species, MP mediates electron transfer to 
the HdrDE heterodisulfide reductase (14). Genomic 
analysis indicates the F420H2 dehydrogenase is a 
thirteen-subunit complex (FpoABCDHIJKLMNOF) with 
identity to the proton-translocating Complex I 
NADH:quinone oxidoreductases found in organisms from the 
Bacteria and the Eucarya (13, 49). Comparison of the 
deduced sequences with Complex I (13) suggests 
FpoAJKMN constitutes the proton transporter module, 
and FpoBCDIHLOF couples electron transfer from 
F420H2 to MP (Fig. 11). FpoF and FpoO have no 
identity to subunits of Complex I, suggesting functions 
unique to the F420H2 dehydrogenase. FpoF, encoded 
separately from the fpoABCDHIJKLMNO operon, is 
postulated to accept electrons directly from F420H2, 
and FpoO to interact with MP (47). 

Figure 11. Postulated topology and function of the F420H2:CoMS-SCoB 
oxidoreductase system in Methanosarcina mazei. MP, 
methanophenazine. 
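
The subunit assignments described for the F420H2 dehydrogenase can be
cross-checked with simple set operations. This is only a consistency
sketch using the subunit letters given in the text; the variable names
are hypothetical.

```python
# Subunit letters as listed in the text (FpoABCDHIJKLMNOF).
ALL_SUBUNITS = set("ABCDHIJKLMNOF")
PROTON_TRANSPORTER = set("AJKMN")     # FpoAJKMN
ELECTRON_TRANSFER = set("BCDIHLOF")   # FpoBCDIHLOF
NO_COMPLEX_I_COUNTERPART = set("FO")  # FpoF and FpoO

subunit_count = len(ALL_SUBUNITS)
modules_are_disjoint = PROTON_TRANSPORTER.isdisjoint(ELECTRON_TRANSFER)
modules_cover_complex = (PROTON_TRANSPORTER | ELECTRON_TRANSFER) == ALL_SUBUNITS
unique_subunits_module = NO_COMPLEX_I_COUNTERPART <= ELECTRON_TRANSFER

print(subunit_count, modules_are_disjoint, modules_cover_complex,
      unique_subunits_module)
```

The two proposed modules account for all thirteen subunits without
overlap, and both subunits lacking a Complex I counterpart fall in the
electron transfer module.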


MOLECULAR BIOLOGY OF 
METHANOGENESIS PATHWAYS 


As for all archaea, gene expression in the metha- 
nogens is a hybrid of features from both the Bacteria 
and Eucarya (227). As in the Bacteria, genes encoding 
proteins in methanogenic pathways are present in 
operons transcribed as polycistronic mRNA, and a 
Shine-Dalgarno-like sequence directs translation initi- 
ation (see Chapter 8). However, the RNA polymerase 
is structurally homologous to that of the Eucarya and 
recognizes promoters that contain an AT-rich TATA 
box element similar to the Eucarya (15) (see Chapter 
6). The TATA box is often preceded by a purine-rich 
B-recognition element (BRE) that facilitates the 
proper orientation of transcription (16). TBP (TATA 
box-binding protein) and TFB (transcription factor 
B), homologs of eucaryal general transcription fac- 
tors, bind the TATA box and BRE elements and re- 
cruit the archaeal RNA polymerase to the promoter 
(15). Genes encoding three distinct TBPs are reported
in the M. acetivorans genome (69), suggesting the
possibility that pairing different TBPs with
TFB is a novel mechanism for gene regulation in this
metabolically diverse species. Although the transcription
apparatuses of the Archaea and Eucarya are similar,
the sequencing of genomes from methanogens reveals
several transcription factors homologous to
regulators in the Bacteria (7, 34, 49, 68, 69, 91, 131,
221, 222). Thus, a major question is how the bacterial-like
regulators interact with the eucaryal-like basal
transcription apparatus. However, regulatory proteins
other than bacterial homologs also function in
the Archaea. In in vitro experiments, the eucaryal-like
regulator Tfx was shown to bind downstream
of the promoter for the fmdECB operon, which encodes
the molybdenum form of the formyl-MF dehydrogenase
that functions in the carbon dioxide reduction
pathway (92). Furthermore, a novel repressor
(NrpR), regulating genes for nitrogen assimilation in
M. maripaludis, represents a new family of regulators
unique to the Archaea (146, 147). Thus, regulation of
transcription in the methanogens is complex and invites
further investigation.


Regulation of Genes in the Carbon Dioxide 
Reduction Pathway 


The expression of several genes in the carbon 
dioxide reduction pathway depends on the levels of 
hydrogen during growth. Northern blot analyses 
indicate that fed-batch cultures of M. thermauto- 
trophicus supplied with excess hydrogen transcribe 
genes encoding the hydrogen-dependent methenyl-THMPT
reductase (reaction 12b) and isozyme II of
methyl-CoM reductase (MRII) (173). In contrast, under
conditions where hydrogen levels were reduced
and growth was limited, the genes encoding the
F420H2-dependent methenyl-THMPT reductase (reaction
12a) and isozyme I of methyl-CoM reductase
(MRI) were transcribed and the growth yields were
greater than for cultures supplied with excess hydrogen.
The results are consistent with a lower apparent
Km for hydrogen for the F420-reducing hydrogenase
compared with that of the hydrogen-dependent
methylene-THMPT dehydrogenase (245). Experiments
conducted with chemostat-cultured M. marburgensis
revealed the same expression pattern for the gene encoding
the F420H2-dependent methylene-THMPT dehydrogenase;
however, expression of the gene encoding
the hydrogen-dependent methylene-THMPT
dehydrogenase was not dependent on the levels of hydrogen,
suggesting the regulation of this gene is different
in M. thermautotrophicus and M. marburgensis
(2). In a more ecologically relevant experiment,
M. thermautotrophicus was cocultured with obligate 
hydrogen-producing microbes in which the hydrogen 
partial pressure was maintained at a measurable low 
level (20 to 80 Pa) (150). Here again, only the gene 
encoding MRI was expressed under hydrogen-limit- 
ing growth conditions. The expression of genes en- 
coding the formate dehydrogenase (reaction 8) of 
M. thermautotrophicus strain Z-245 increases when 
hydrogen becomes limiting in cultures growing on 
hydrogen and carbon dioxide (184). Consistent with
these results is the report that M. maripaludis contains
two sets of genes encoding formate dehydrogenase,
and quantitative lacZ fusion experiments show
both sets are downregulated when cells are grown
with hydrogen plus carbon dioxide (272). Furthermore,
when both formate and hydrogen plus carbon
dioxide were present, expression of both sets was
downregulated compared with growth on formate
alone, which indicates that the regulation of transcription
is controlled by hydrogen and not formate.
At least for these two species, the results are consis- 
tent with a response that allows growth on formate 
when hydrogen is limiting in the environment. Hy- 
drogen partial pressures also control the expression 
of genes encoding flagella in Methanocaldococcus 
jannaschii (Methanococcus jannaschii) (175). Fla- 
gella synthesis occurred when hydrogen was reduced 
from 178 kPa to 650 Pa, consistent with a role for 
this chemotactic response in adaptation to changing 
concentrations of hydrogen in the environment. In 
summary, although details of the transcriptional reg- 
ulation are unknown, it appears that the energy- 
yielding metabolisms of carbon dioxide-reducing 
species adjust to different hydrogen concentrations in 
the environment to maintain an ecological advan- 
tage. Finally, the regulation of genes in response to 
the hydrogen partial pressure suggests a signal-sens- 
ing and transmission mechanism that has yet to be 
investigated. 

Several lines of evidence indicate that the availability
of metals regulates the expression of genes encoding
metalloproteins essential for the carbon dioxide
reduction pathway. The regulation of hydrogenase-encoding
genes in M. voltae depends on the levels of
selenium in the growth medium. M. voltae synthesizes
two types of hydrogenases, F420-reducing and
non-F420-reducing, for which there are selenium-containing
and selenium-free isozymes of each type. The
genes encoding both types of selenium-free hydroge- 
nases are only transcribed under selenium limitation 
and linked by a common intergenic region contain- 
ing activator and repressor binding sites (182). A 55- 
kDa putative regulatory protein binds to the positive 
element (176), and a LysR-type regulator (HrsM) 
binds to the negative element (238). Northern blot 
analysis indicates that hrsM is autoregulated. 

A similar pattern of regulation of selenocysteine- 
containing and selenium-free hydrogenases by sele- 
nium is observed in M. maripaludis (206). However, 
when the selB gene essential for selenocysteine inser- 
tion is inactivated, selenium no longer exerts a regu- 
latory effect, indicating that free selenium is not the 
effector molecule but rather a selenium-containing 
molecule. The genome sequence of M. kandleri is annotated
with two types of tungsten-containing
formyl-MF dehydrogenases (FwuB and FwcB) whose 
active sites contain either selenocysteine (FwuB) or 
cysteine (FwcB). Northern blot and primer-extension 
analyses indicate that, when grown in selenium-sup- 
plemented medium, only fwuB is transcribed; how- 
ever, in media with low concentrations of selenium, 
both fwuB and fwcB are transcribed (252). The role 
of selenium or selenium-containing effector molecules 
in the regulation of selenoproteins of M. voltae has 
yet to be investigated. The presence of molybdenum 
in the growth medium is essential for transcription 
of genes encoding the molybdenum-containing for- 
mate dehydrogenase of M. formicicum (263) and the 
molybdenum-containing formyl-MF dehydrogenase 
(reaction 9) of M. thermautotrophicus (93), results 
consistent with a role for molybdenum as an effector 
molecule in regulation of transcription. Finally, when 
M. marburgensis is cultured in a chemostat with low 
concentrations of nickel that limit growth, there is an 
increase in gene expression and enzyme activity levels 
of the hydrogen-dependent methylene-THMPT dehydrogenase
(2). Furthermore, levels of the nickel-containing
F420H2-dependent hydrogenase decreased. The
results are consistent with the inability of the nickel-containing
hydrogenase to supply F420H2 for the
F420H2-dependent methylene-THMPT dehydrogenase,
which is compensated by increased levels of the
hydrogen-dependent methylene-THMPT dehydrogenase.


Regulation of Genes in the Acetate 
Fermentation Pathway 


Development of a plasmid-mediated lacZ fusion 
reporter system in M. acetivorans has revealed that 
the operon encoding the Cdh complex (reaction 16) 
of M. thermophila is 54-fold downregulated in M. 
acetivorans grown on methanol compared with ac- 
etate, a result consistent with a unique role for this 
complex in the acetate fermentation pathway and 
preference for growth on methanol (6). The results 
are consistent with an earlier report on the regulation 
of cdhA in M. thermophila examined by Northern 
blotting (231). The genome sequences of M. acetivo- 
rans (69) and M. mazei (49) contain duplicate cdh 
operons with greater than 95% identity. The presence 
of two operons with high sequence identity raises the 
question as to whether both are transcribed during 
growth on acetate. A proteomic analysis of M. acetivorans
indicates that one complex is expressed at
least 16-fold over the other (145). The results are consistent
with the predominance of a single Cdh complex
in acetate-grown M. mazei strains Gö1 and C-16
and expression of the cognate operon examined by
Northern blotting. Although M. mazei strains Gö1
and C-16 encode a duplicate cdh operon with high
identity, evidence for expression was not reported
(54). In contrast, global transcriptional profiling of
M. mazei Gö1 suggests both cdh operons are transcribed
at approximately equal levels (94). Thus, expression
of the duplicate operons in M. mazei strain
Gö1 is unresolved.

Ferredoxin accepts electrons from the Cdh com- 
plex of acetate-grown Methanosarcina species (62, 
242, 243), and the gene encoding the ferredoxin from 
M. thermophila is downregulated in methanol- versus 
acetate-grown M. thermophila (42). Also essential 
and unique to the acetate fermentation pathway are 
acetate kinase and phosphotransacetylase (reactions 
14 and 15), for which the encoding genes form an 
operon that is downregulated in methanol-grown 
compared with acetate-grown M. thermophila (219). 
Global proteomic and DNA microarray analyses of 
M. mazei and M. acetivorans (94, 144, 145) confirm 
the downregulation of genes essential and unique to 
the acetate fermentation pathway when cells are presented
with methanol. This regulation in response to
the growth substrate is consistent with the greater
amount of energy available during growth with
methanol (ΔG°′ = −106 kJ/mol CH4) versus acetate
(ΔG°′ = −36 kJ/mol) (244), providing cells with an
ecological advantage.
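
The energetic comparison above can be made concrete with simple arithmetic. The following is a minimal sketch that uses only the two standard free-energy values quoted in the text; it assumes both values are expressed per mole of methane formed, and the names `DELTA_G_KJ_PER_MOL` and `energy_ratio` are illustrative, not taken from any cited work. The ratio is a rough thermodynamic comparison and not a measured growth yield.

```python
# Standard free-energy changes quoted in the text (kJ/mol).
# Assumption: both values refer to one mole of CH4 formed.
DELTA_G_KJ_PER_MOL = {
    "methanol": -106.0,
    "acetate": -36.0,
}


def energy_ratio(substrate_a: str, substrate_b: str) -> float:
    """Return the ratio of free energy released by substrate_a to substrate_b."""
    return DELTA_G_KJ_PER_MOL[substrate_a] / DELTA_G_KJ_PER_MOL[substrate_b]


ratio = energy_ratio("methanol", "acetate")
print(f"Methanol releases about {ratio:.1f}x the free energy of acetate")
```

On these numbers, methanol offers nearly three times the free energy of acetate per mole of methane, which fits the observed downregulation of acetate-specific genes when methanol is available.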


Regulation of Genes in One-Carbon 
Dismutation Pathways 


M. burtonii is a cold-adapted species that utilizes 
methanol and methylamines for growth and methano- 
genesis. A proteomic and transcriptional analysis of 
cold adaptation has revealed the thermal regulation 
of several genes essential for methanogenesis by dis- 
mutation of the methyl group of trimethylamine (72, 
73). In particular, compared with growth at 23°C, 
cells grown at 4°C have elevated levels of F420H2 dehydrogenase
(Fig. 11), which generates a proton gradient
driving ATP synthesis. A sodium motive force
is also generated by the methyl-THMPT:CoM methyltransferase
(reaction 3) during growth on methylamines.
However, at low temperatures, a proton
motive force is easier to maintain, which is one explanation
for elevated levels of F420H2 dehydrogenase
when cells are grown at 4°C. Cells also upregulate
synthesis of the F420-dependent hydrogenase at the
higher growth temperature, suggesting a role for hydrogen
in electron transport. One possibility is that
the hydrogen-dependent methylene-THMPT dehydrogenase
(reaction 12b) functions at the higher temperature
during oxidation of the methyl group, and
the hydrogenase oxidizes the hydrogen with reduction
of F420, which supplies electrons to the proton-translocating
F420H2 dehydrogenase. The extent of
regulation of genes essential for methanogenesis and 
other fundamental processes in response to tempera- 
ture is consistent with a role in providing the cell with 
an ecological advantage in cold environments. 

As mentioned in “Regulation of Genes in
the Acetate Fermentation Pathway” above, growth on
methanol is preferred over acetate. Thus, it is not surprising
that proteomic and DNA microarray analyses
show that genes essential for methanogenesis from
methanol are upregulated when Methanosarcina 
species are cultured with this growth substrate (94, 
144). Of particular interest is the regulation of mul- 
tiple homologs of genes encoding methanol-specific 
methyltransferases MT1 and MT2 (Fig. 10). All 
Methanosarcina species contain an unusually large 
number of duplicated genes, which raises questions re- 
garding expression and function of the gene products. 
The genome sequences of M. acetivorans, M. mazei, 
and M. thermophila harbor two homologs of mtaA, 
three homologs of mtaB, and three homologs of 
mtaC encoding enzymes specific for methanogenesis 
from methanol. Proteomic analyses of M. ther- 
mophila have established that none of the genes are 
silent and that methanol-grown cells express mtaA-1, 
mtaB-1, mtaB-2, mtaB-3, mtaC-1, mtaC-2, and 
mtaC-3, whereas acetate-grown cells express mtaB-3 
and mtaC-3 (53). The authors speculate that synthe- 
sis of the methyltransferases in acetate-grown cells fa- 
cilitates the switch from acetate to methanol as the 
growth substrate. A genetic analysis of multiple homologs
of genes encoding the methyltransferases in
M. acetivorans confirms results obtained with M. thermophila
(200). Genetic analyses further showed that
the ΔmtaCB1 ΔmtaCB2 deletion mutant of M. acetivorans
cultured with methanol grew more slowly
than wild type, suggesting all three MT1 methyltransferases
are required for wild-type growth on
methanol.


Pyrrolysine, the 22nd Amino Acid 


A most interesting outcome of the biochemical
characterization of the methyltransferase systems is
the discovery of the 22nd amino acid, pyrrolysine
(Fig. 12), which was first found in the MtmB subunit
of MT1 catalyzing transfer of the methyl group of
monomethylamine to the corrinoid cofactor of the
MtmC subunit of MT1 (84) (see Chapter 9). Since
then, the cognate enzymes for dimethylamine and
trimethylamine (MtbB and MttB) have been reported to
have UAG-encoded pyrrolysine residues (73, 224).
Although a function for pyrrolysine has yet to be determined
(126), multiple copies of mtmB, mtbB, and
mttB containing in-frame UAG codons are reported
in M. barkeri, M. mazei, M. acetivorans, and M. thermophila,
supporting an essential function in the first
step of the pathway for growth with methylated
amines (49, 53, 69, 208). A mutant unable to translate
UAG grows on methanol and acetate but not
methylamines, indicating that the role for pyrrolysine
is restricted to growth on methylamines (161). Furthermore,
computational prediction of sequences encoding
pyrrolysine has identified only a small number
of genes in selected genomes with the potential to encode
pyrrolysine (37). Thus, it would appear that the
22nd amino acid is not widely used in nature and
may be specific to the metabolism of methylamines.
Although pyrrolysine is encoded by a potential stop
codon, similar to selenocysteine (UGA), the mechanisms
of incorporation for these two unusual amino
acids into proteins may be different (126). Whereas
selenocysteine is synthesized attached to tRNA,
pyrrolysine is charged directly onto a dedicated tRNA
(199). The mechanism of translating UAG is unknown
and under investigation. A role for UAG in
stopping translation has not been reported for
Methanosarcina species, consistent with UAG functioning
as a sense codon, the same as for all other
standard amino acids; however, a role for cis- and
trans-acting factors cannot be ruled out.

Figure 12. Pyrrolysine ((4R,5R)-4-methyl-pyrroline-5-carboxylate)
present in MtmB, MtbB, and MttB.


PERSPECTIVE: THE NEXT FIVE YEARS 


Although a wealth of information has accumu- 
lated over the past few decades on the enzymology 
of one-carbon reactions in methanogenesis, less is 
known about the details of membrane-bound elec- 
tron transport and the structure/function of electron 
transfer proteins. The unusual structures revealed in
the past for enzymes involved in one-carbon transformations
portend novel structures and functions for
electron transfer proteins and electron carriers. Advances
in determining the crystal structures of membrane
proteins are likely to be applied to the
methanogens, facilitating discovery. Genomic sequencing
of several methanogens has revealed that a
large percentage of the proteome consists of hypothetical
proteins, many of which are likely to be involved
in methanogenic pathways (34, 49, 68, 69, 91, 221,
222). Together with robust genetic systems paving the 
way for gene knockouts (20, 166), the discovery of 
novel proteins and their functions is at a threshold. 
Another understudied area ripe for discovery is gene 
regulation, in particular, the discovery and role of reg- 
ulatory proteins in the context of a Eucarya-like basal 
transcription system. The finding that species related 
to Methanosarcina and Methanococcoides are pre- 
sent in anaerobic methane-oxidizing consortia (25), 
and that the process may be a reversal of me- 
thanogenesis pathways involving similar enzymes 
(172, 214), opens an entirely new field of research. 
Anaerobic methane oxidation, together with the exciting
finding that plants produce methane aerobically
(119), promises an unmatched flood of discovery revealing
still more novel proteins and biochemical
principles.


Chapter 14


Proteinaceous Surface Layers of Archaea: 
Ultrastructure and Biochemistry 


HELMUT KÖNIG, REINHARD RACHEL, AND HARALD CLAUS


INTRODUCTION 


At the end of the 1970s a group of “prokary- 
otes” was recognized as a third domain of life dis- 
tinct from the Bacteria and Eucarya (159, 160) (see 
Chapter 1). This domain was named Archaea, re- 
flecting the fact that many species were found to live 
in habitats resembling the conditions of the early 
Earth (i.e., archaic). They comprise the methanogens, 
the extreme halophiles, the thermoacidophiles, sul- 
fate- and/or sulfur-metabolizing thermophiles and 
hyperthermophiles (see Chapter 2). Archaea thrive in 
anaerobic niches, salt lakes, marine hydrothermal 
systems, and continental solfataras. However, by us- 
ing 16S rDNA oligonucleotides as specific probes, 
they have also been found in many so-called moder- 
ate ecosystems, such as soil (13), freshwater (124), 
seawater, and marine sediments (34, 35). It has been 
estimated that Archaea make up about 20% of the 
marine picoplankton (67). According to our current 
knowledge, the archaeal domain consists of four 
main phylogenetic lineages: the Crenarchaeota, the 
Euryarchaeota, the Nanoarchaeota, and the Kor- 
archaeota. 

In early studies (see below) it became obvious 
that the phylogenetic diversity of the Archaea was 
also reflected in a remarkable diversity of cell enve- 
lope types (Fig. 1) (9, 39, 63, 64, 89, 100). All Archaea 
lack muramic acid and a lipopolysaccharide-contain- 
ing outer membrane (see the notable exception of 
Ignicoccus in “The outer membrane of Ignicoccus 
and the surface layer of Nanoarchaeum,” below). 
They also lack a universal cell wall polymer. The cell
walls of the Archaea are composed of different polymers
such as glutaminylglycan, heteropolysaccharide,
methanochondroitin, pseudomurein, protein, glyco- 
protein, or glycocalyx. The most common archaeal 
cell envelope is composed of a single protein or gly- 
coprotein surface layer (S layer), which is directly as- 
sociated with the cytoplasmic membrane. The thermo- 
plasmas lack a cell wall, but they possess a glycocalyx 
instead. Some cell envelope types are restricted to the 
Archaea. However, similar building blocks may be 
found in natural polymeric compounds (e.g., con- 
nective tissue) in members of the other two domains 
of life. 

As early as the 1950s, significant differences be- 
tween typical bacterial cell walls and those of Ar- 
chaea were established during cell envelope investiga- 
tions of Halobacterium (54). Subsequent work with 
cells of Sulfolobus (150, 151), Halococcus (18), and 
Methanosarcina (61) also showed unusual structures. 
The S-layer glycoprotein of Halobacterium salinarum 
was the first glycoprotein discovered in bacteria and 
archaea (97, 98). Initially, these novel cell wall struc- 
tures were viewed as curiosities, and their taxonomic 
significance was not realized until the concept of the 
Archaea was published (159). At this time, the results 
of cell wall studies supported the new view of the 
phylogeny of the Bacteria and Archaea. 

Many archaea possess proteinaceous surface lay- 
ers (S layers), which form two-dimensional regular 
arrays (6, 11, 63, 64, 101, 139). The chemical struc- 
ture of archaeal S-layer glycoproteins has been deter- 
mined in detail for a few archaeal species, e.g., 
Methanothermus fervidus (64, 66), H. salinarum and 
Haloferax volcanii (88, 138, 139), and Staphylother- 
mus marinus (116). 

Figure 1. Cell wall profiles of Archaea. CM, cytoplasmic membrane; GC, glycocalyx; GG, glutaminylglycan; HP,
heteropolysaccharide; LP, lipoglycan; MC, methanochondroitin; PM, pseudomurein; PS, protein sheath; SL, S layer.
Representative genera shown with the profiles: Methanococcus, Halobacterium, Pyrodictium, Sulfolobus, Thermoproteus;
Methanospirillum; Thermoplasma; Methanosarcina; Methanobacterium, Methanosphaera, Methanobrevibacter, Halococcus,
Natronococcus; Methanothermus, Methanopyrus.


GLYCOSYLATED S LAYERS OF (HYPER-) 
THERMOPHILIC ARCHAEA 


Most of the known (hyper-)thermophilic and 
also many mesophilic species of the Archaea have 
been shown to possess S layers (64). The morpholog- 
ical building blocks are composed of six, four, three, 
or two subunits of one type of a (glyco-)protein. Ac- 
cordingly, the symmetry axis with the highest symme- 
try is p6, p4, p3, or p2 (Fig. 2; Table 1). 

In most species, the S layer is the sole cell envelope
component outside the cytoplasmic membrane,
to which it is anchored by an elongated, filamentous
protrusion, spanning the quasi-periplasmic space between
the S layer and the membrane. The center-to-center
spacing varies among the known genera and
species between about 11 nm in several species of the
order Methanococcales (M. vannielii and M. thermolithotrophicus
[85, 109]) and 30 nm, as found in several
species of the order Thermoproteales, such
as Thermoproteus tenax (99) or Pyrobaculum islandicum
(117). The molecular masses of the surface proteins
range from 40 to 325 kDa (101).

Figure 2. Scheme showing the arrangement of S-layer subunits.
From the left: p2, p3, p4, and p6.

The two- and three-dimensional structures of iso- 
lated S-layer sheets have been studied in numerous 
species by electron crystallography: Sulfolobus acido- 
caldarius (31, 92, 141), Sulfolobus solfataricus (119), 
Sulfolobus shibatae (91), Acidianus brierleyi (8), 
T. tenax (99, 157), P. islandicum (117), Pyrobaculum 
organotrophum (118), Desulfurococcus mobilis (158), 
Pyrodictium occultum (51), Pyrodictium brockii (37), 
Hyperthermus butylicus (7), and Archaeoglobus fulgidus
(73). Complementary structural information was
obtained by studying freeze-fractured or freeze-etched 
cells, ultrathin sections, and freeze-dried and heavy- 
metal shadowed isolated S-layer sheets. Structural in- 
formation at lower resolution is available for some 
newer isolates such as Pyrobaculum aerophilum (146), 
Pyrodictium abyssi (121), Picrophilus oshimae (125), 
Ferroglobus placidus (48), Pyrolobus fumarii (14), and 
Archaeoglobus veneficus (57). 

Table 1. Characteristic structural features of archaeal S layers

Genus/species                                 Symmetry  Center-to-center (nm)  Width of periplasm (nm)  Source
Crenarchaeota
  Sulfolobales
    Sulfolobus sp.                            p3        20-21                  20-25                    6, 91, 92
    Metallosphaera sp.                        p3        21                     25                       45
    Acidianus brierleyi                       p3        21                     25                       8
  Thermoproteales
    Pyrobaculum organotrophum                 p6        ~30                    25                       118
    Pyrobaculum islandicum                    p6        ~30                    25                       117
    Thermoproteus tenax                       p6        ~30                    25                       99, 157
    Thermofilum sp.                           p6        27                     25                       Rachel, unpublished results
  Desulfurococcales
    Desulfurococcus mobilis                   p4        18                     30                       157
    Staphylothermus marinus                   p4        36                     60-70                    116
    Aeropyrum pernix                          p4        19                     30                       Rachel, unpublished results
    Pyrolobus fumarii                         p4        18.5                   45                       14
    Pyrodictium sp.                           p6        21                     35                       37, 51, 121
    Hyperthermus butylicus                    p6        25                     ~20                      7
    Ignicoccus sp.                            -         -                      20-400                   120
Nanoarchaeota
    Nanoarchaeum equitans                     p6        16                     20                       56
Euryarchaeota
  Thermococcales
    Thermococcus sp., Pyrococcus sp.          p6        ~15-20                 ~10                      7, 36, 58, 71, 84
  Methanobacteriales
    Methanothermus fervidus                   p6        20                     15-20                    This workᵃ
  Methanococcales
    Methanothermococcus thermolithotrophicus  p6        11                     ~10                      109
    Methanocaldococcus jannaschii             p6        11.5                   10                       109
    Methanococcus vannielii, M. voltae        p6        10-11                  ~10                      85, 109
  Archaeoglobales
    Archaeoglobus sp.                         p6        17.5                   10                       57, 73
    Ferroglobus placidus                      p4        23                     ~10                      48
  Thermoplasmatales
    Picrophilus sp.                           p4        20                     40                       125
  Halobacteriales
    Halobacterium salinarum                   p6        18                     6.5                      77, 144
    Haloferax volcanii                        p6        18                     6.5                      144
  Methanomicrobiales
    Methanoplanus limicola                    p6        14.7                   5-10                     24

ᵃ See text and Fig. 12.

In some studies, the arrangement of the subunits
is characteristic of the genus (e.g., Pyrodictium versus
Pyrolobus versus Hyperthermus; Archaeoglobus versus
Ferroglobus; Thermoplasma versus Picrophilus).
In other cases, it is similar or identical in all species
of the genera belonging to a family (Thermoproteus
and Pyrobaculum within the Thermoproteaceae; Sulfolobus,
Acidianus, and Metallosphaera within the
Sulfolobaceae).

These results indicate that, at least to a limited
extent, the S-layer structure correlates with the organism’s
phylogeny (6), as determined by 16S rDNA
sequencing (see Chapter 2). In the following sections,
the S-layer structures of (hyper-)thermophilic archaea
are described in a phylogenetic context.


Crenarchaeota


Sulfolobales


The S layer of S. acidocaldarius (150, 151) was
first isolated by lysing the cells with sodium dodecyl 
sulfate (SDS; 0.15%), digestion with DNase, and re- 
peated treatment of the cell-shaped S-layer sheets 
with SDS. The isolated S layers were disintegrated 
with phosphate buffer pH 9 at 60°C, and the solubi- 
lized protein was purified by chromatography on 
Sepharose (102). Chemical analysis revealed that the 
S layer was composed of a single glycoprotein occur- 
ring in two modifications of apparent molecular 
masses of about 140 and 170 kDa, respectively. The 
glycoproteins contained high levels of serine and threonine
and low levels of basic amino acids and dicarboxylic
amino acids (102). Evidence was provided
that a small protein subunit may anchor the S layer to 
the cell membrane of S. acidocaldarius (47). 

The first electron micrographs (31, 141) of puri- 
fied S layers of S. acidocaldarius were interpreted as 
showing that the subunits were arranged on a hexag- 
onal lattice, with a two-sided plane group p6, and a 
22-nm unit cell dimension. The three-dimensional 
structure of the S layer was elucidated by electron 
crystallography to about 2-nm resolution (31, 141). 
The three-dimensional (3D) model showed dimeric 
building blocks, arranged to form a series of hexago- 
nal and triangular holes. This first 3D structure of an 
archaeal S layer already identified a feature that has 
subsequently been found to occur in many archaeal S
layers: the external surface was fairly smooth, while
the surface facing the interior of the cell appeared
sculptured, with large dome-shaped cavities and protruding
“pedestals,” which are now known to anchor
the S layer into the lipid bilayer of the cytoplasmic 
membrane (6). The protein substructure consists of 
three types of globular domains, diad (D), triad (T), 
and ring region (R), connected by narrow bridges. 
These may act as “hinges,” allowing the S layer to form
a curved surface (lobes) and to follow the movements
of the cell surface during growth. Note that the larger
pores of the S layer have the same diameter as pili 
(150). Pili have been observed that attach the cells to 
sulfur crystals (150). Similar results were subsequently 
obtained for the S layer of S. solfataricus (119). 

The interpretation that the S layers of Sulfolobus 
species had sixfold symmetry was later shown to be 
incorrect (6, 91, 92). By improving the technique of
sample preparation, image recording using cryoelec- 
tron microscopy, and refining the image-processing 
methods of the electron micrographs (including image 
classification), it became clear that the S layer of 
S. acidocaldarius had “only” threefold symmetry 
(92). The S layer of S. shibatae was also found to have 
p3 symmetry and the same fine structure (distribution 
of protein mass) as determined for S. acidocaldarius, 
with a mosaic arrangement of the protein complexes 
and with the frequent occurrence of twin boundaries 
(where neighboring unit cells are observed to be rotated
by 120 degrees) and distortions (where unit cells
on one lattice line are slightly displaced and rotated relative to
each other) (91).

It has now been established that the S layers of
the phylogenetically related organisms (Fig. 3) S. acidocaldarius,
S. solfataricus, S. shibatae, Acidianus infernus
(Fig. 4a; R. Rachel and H. Huber, unpublished
results), Acidianus brierleyi (8), Metallosphaera sedula,
and Metallosphaera prunae (45) all have a high degree
of similarity to each other. Their fine structure and
mass distribution, the unique p3 symmetry, the center-to-center
distance of about 21 nm, and the width of
their periplasm (about 25 nm) are almost identical
(6). These structural characteristics appear
to be common to many, if not all, species belonging
to the crenarchaeal order Sulfolobales.

Figure 3. Phylogenetic tree of Archaea. The relative phylogenetic positions of the 16S rDNA sequences of archaea (Crenarchaeota,
Euryarchaeota, and Nanoarchaeum) discussed in this chapter are depicted in the tree. The arrangement and unit cell dimensions
of S-layer subunits are shown. Crenarchaeota: Sulfolobus, Metallosphaera, Acidianus: p3, ~21 nm; Ignicoccus: no S layer;
Pyrolobus: p4, 18.5 nm; Pyrodictium: p6, 21 nm; Thermoproteus, Pyrobaculum: p6, 30 nm. Nanoarchaeum: p6, 16 nm.
Euryarchaeota: Pyrococcus, Thermococcus: p6, 15-20 nm; Methanococcus: p6, 10-11.5 nm; Methanothermus: p6, 20 nm;
Archaeoglobus: p6, 17.5 nm; Halobacterium, Haloferax: p6, 18 nm; Methanoplanus: p6, 14.7 nm; Methanospirillum: p2 (sheath).

Figure 4. Transmission electron micrographs of different Archaea. (a-c) Crenarchaeota; (d, e) Euryarchaeota: Thermococcales.
All cells prepared by freeze etching. (a) Acidianus infernus: p3 symmetry; (b) Aeropyrum pernix: p4 symmetry; (c) Thermoproteus
tenax: p6 symmetry; (d) Thermococcus chitonophagus; (e) Thermococcus acidaminovorans. All bars = 1 µm.


The putative S-layer gene of S. acidocaldarius 
DSM 639 was identified in the recently published 
whole genome sequence (25). The protein Slpsa1 con- 
sists of 1,424 amino acids, corresponding to a mole- 
cular mass of 151,040 daltons (Table 2). A putative 
leader peptide of 29 amino acids would be cleaved 
after membrane translocation to obtain the mature 
protein. Cysteine is present as found for other hyper- 
thermophilic S-layer (glyco)proteins. 
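
As a quick plausibility check on the sequence-derived masses, one can compute the mean mass per residue for the three S-layer proteins listed in Table 2. The sketch below is illustrative only: the sizes and masses are the Table 2 values, while the dictionary and function names are assumptions made for this example.

```python
# (size in amino acids, calculated molecular mass in Da), as listed in Table 2.
PROTEINS = {
    "S. acidocaldarius Slp 1": (1424, 151040),
    "S. solfataricus Slp 1": (1231, 131868),
    "S. tokodaii Slp 1": (1440, 156240),
}


def mean_residue_mass(n_residues: int, mass_da: float) -> float:
    """Return the average mass per amino acid residue in daltons."""
    return mass_da / n_residues


# A mean residue mass near 110 Da is a common rule of thumb for proteins.
for name, (size, mass) in PROTEINS.items():
    print(f"{name}: {mean_residue_mass(size, mass):.1f} Da per residue")
```

All three values fall slightly below the 110 Da rule of thumb, so the listed sizes and masses are mutually consistent. The check says nothing about glycosylation, which adds mass not captured by the sequence.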


Table 2. Putative S-layer glycoproteins and membrane glycoproteins in three Sulfolobus species

Organism                   N-terminal sequenceᵃ  Annotation                                   Accession no.  Size   MW       MW (kDa),  Leader         pIᵈ   PAS staining
                                                                                              (NCBI)         (aa)   (Da)ᵈ    SDS-PAGE   peptideᶜ (aa)        (N-glycosylation sites)ᵉ
S. acidocaldarius DSM 639  LTPEVSAGGIQAYLL       S-layer glycoprotein (experimental), Slp 1   CAJ30479       1,424  151,040  150ᵇ       29             5.17  + (29)
                           TVPVILYAPFIFᶠ
S. solfataricus P2         AATYTIPSVT            S-layer glycoprotein (experimental), Slp 1ᵍ  CAJ31324       1,231  131,868  130        30             6.56  + (37)
S. tokodaii strain 7       n.d.                  S-layer glycoprotein (putative), Slp 1ʰ      BAB67302       1,440  156,240  n.d.       28             4.58  n.d. (40)
S. acidocaldarius DSM 639  AVTINGITFYSPV         Membrane protein                             YP_256928      475    49,546   60         24             9.47  + (13)
S. solfataricus P2         ISKTLVAVIIVVVI        Maltose ABC transporter protein              NP_342629      523    56,856   60         -              5.98  + (10)

ᵃ Determined by Edman degradation (B. Schlott, IMB, Jena).
ᵇ Matrix-assisted laser desorption/ionization (MALDI) determination: 176,974 Da.
ᶜ Determined with SignalP.
ᵈ Determined with ProtParam.
ᵉ Determined with Prosite.
ᶠ Internal fragment obtained with proteinase K.
ᵍ Sequence alignments (Blast 2) with the S-layer glycoprotein Slp 1 show 25% identical residues.
ʰ Sequence alignments (Blast 2) with the S-layer glycoprotein Slp 1 show 21% identical residues.
Desulfurococcales


For D. mobilis, the S layer was shown to exhibit 
p4 symmetry and a rather open meshwork of protein, 
with a lattice of 18 nm (158). Two other species in 
this order also have an S layer with p4 symmetry,
although with a different fine structure. P. fumarii, an
archaeon that grows at temperatures up to 113°C, has an S layer
that encloses a 40-nm-wide “quasi-periplasmic space” 
and has a lattice of 18.5 nm (14). S. marinus has 
a unique glycoprotein S layer, named “tetrabra- 
chion,” which was intensively studied by biochemical 
methods, gene sequencing, and electron microscopy 
(115, 116). The morphological subunits of the S layer 
appear as a branched filiform meshwork and form a 
canopy with a distance of about 70 nm from the cell 
membrane, which encloses a “quasi-periplasmic 
space” (94). A morphological subunit is a tetramer
of polypeptides (Mr, 92 kDa) that form a parallel,
four-stranded α-helical rod, 70 nm long. It separates
at one end into four strands, or straight arms (hence
the name “tetra-brachion,” “four arms”). Each
polypeptide arm (Mr, 85 kDa; 24 nm in length) is
composed of β-sheets and provides lateral connectivity
to neighboring morphological subunits by end-to-end
contacts (115). Attached to the middle of each
stalk are two copies of a protease that provide an ex- 
odigestive function related to the heterotrophic en- 
ergy metabolism of the organism (94). The tetrabra- 
chion structure (that includes the protease) exhibits 
an unusual thermal stability (115, 116). 

Recently, freeze etching was used to investigate 
the fine structure of cells of Aeropyrum pernix, a 
closely related Aeropyrum isolate, and three other iso- 
lates (CB9, HVE1, CH11), which were obtained from 
hot springs (R. Rachel, I. Wyschkony, H. Schmidt, 
and H. Huber; unpublished results). The physiological 
characterization of the three new isolates is incom- 
plete. According to their 16S rRNA gene sequence, 
they all belong to the Desulfurococcales. Electron mi- 
crographs of freeze-etched cells (Fig. 4b) revealed that 
they have S layers with p4 symmetry and a lattice of 
about 18 to 19 nm, with obvious differences in their 
surface relief and in the mass distribution of the pro- 
tein complexes. An S layer with p4 symmetry, an open 
network of protein, and a comparably large quasi- 
periplasmic space (30 to 70 nm) is a common feature 
of these organisms and of three others, Desulfuro- 
coccus mobilis, Staphylothermus marinus, and Py- 
rolobus fumarii. 

In contrast to the Aeropyrum-related strains, 
cells of the genera Hyperthermus and Pyrodictium 
have S layers with p6 symmetry. For all species of the 
genus Pyrodictium that have been isolated, P. occul- 
tum, P. abyssi, P. abyssi strain TAG11, and P. brockii, 


the center-to-center distance of the S layer is about 
21 nm, and their mass distribution, surface reliefs, 
and 3D structures are almost identical (37, 51, 121). 
However, distribution of protein mass and surface re- 
lief of the H. butylicus S layer is clearly different from 
that of the Pyrodictium species (7); also, its center- 
to-center distance of 25 nm is significantly larger than 
in Pyrodictium. This shows that the phylogeny in this 
family, as determined by the sequence of the 16S 
rDNA, correlates with S-layer ultrastructure. 


Thermoproteales 


T. tenax (Fig. 4c) and Thermofilum pendens pos- 
sess extraordinarily rigid S-layer sacculi of hexago- 
nally arranged subunits that are resistant to detergent 
and protease treatment (166, 167). The S layer could 
be isolated by disrupting the cells by using sonication 
followed by incubation with DNase and RNase, SDS 
treatment (2% SDS, 100°C, 30 min), and differential 
centrifugation (81, 157). In T. tenax, the 25-nm-wide 
interspace is due to long protrusions that extend from 
the relatively thin (3 to 4 nm) filamentous network 
of the outer surface of the S layer toward the cyto- 
plasmic membrane. The distal ends of these pillarlike 
protrusions appear to penetrate the membrane, thus 
serving as membrane anchors (99, 157). 

The S layers of at least three species belong- 
ing to the genus Pyrobaculum, P. islandicum (117), 
P. organotrophum, and P. aerophilum (146), all have 
the same characteristic S-layer structure, which is al- 
most identical to the S layer of Thermoproteus. This 
consists of a delicate protein meshwork, with long 
protrusions serving as membrane anchors, that en- 
closes a 25-nm-wide periplasm, a center-to-center dis- 
tance of about 30 nm, and a common chemical and 
thermal stability. Preliminary investigations with cells 
of two Thermofilum species showed a similar S-layer 
fine structure, with p6 symmetry and a center-to-center 
distance of 27 nm (R. Rachel, unpublished results). 


Euryarchaeota 
Methanococcales 


The S layers of various genera of this order do 
not show a high level of detail when examined by 
freeze etching, indicating that they do not have elab- 
orate surface relief (Fig. 5). The only common fea- 
ture presently identified is the center-to-center dis- 
tance of only 11 nm (85, 109) (K. Schuster and 
R. Rachel, unpublished results). The periplasmic space
is narrow, with a width of about 5 to 10 nm (20).
Figure 5. Transmission electron micrographs of species of the Methanococcales. (a) M. thermolithotrophicus; (b) M. jannaschii;
freeze-etching. Bar = 0.5 µm.


Thermococcales 


Similar to the Methanococcales, only limited in- 
formation is available on the 3D structure of these 
S layers (Fig. 4d, e). The S layers of two genera of 
this order, Thermococcus and Pyrococcus, could not 
be isolated successfully by detergent extraction. In 
addition, they did not show much detail when exam- 
ined by freeze etching because their surface relief is 
shallow (36, 58, 71). The only common feature is a 
lattice constant of ~15 to 20 nm (9, 84). The width 
of the periplasmic space in thin sectioned cells pre- 
pared by quick freezing and freeze substitution is 
about 10 nm (58). 


Archaeoglobales 


Cells of two Archaeoglobus species have been 
investigated. A. fulgidus (73) and A. veneficus (57) 
have an S layer with a center-to-center distance of 
18 nm. When the 3D structure of the A. fulgidus S
layer was determined by electron crystallography,
the layer showed remarkable structural instability.
It could not be isolated by detergent extraction
of whole cells, as can the S layers of many Crenarchaeota;
it could be imaged only after adsorbing the cells
to the carbon support and on-grid detergent extraction
for only a few seconds (73). Based on its 16S rRNA
gene sequence and physiology, F. placidus is related to
the genus Archaeoglobus. In contrast to Archaeoglobus,
F. placidus exhibits a different type of S layer,
with a square lattice and a spacing of 23 nm (48). 


Thermoplasmatales 


Species of the genera Thermoplasma and Ferro- 
plasma do not show any sign of an S layer. The only 
known genus of this order that exhibits an S layer is 
Picrophilus, which is known for its ability to grow at 
pH 0. The S layer has p4 symmetry and a center-to- 
center distance of 20 nm. The width of the periplas- 
mic space is about 40 nm (125). 


General remarks 


In summary, many ultrastructural features
of S layers characterize the (hyper)thermophiles.
Within the Crenarchaeota that have been
investigated, most S layers are stable and robust. 
They can be isolated by detergent extraction of whole 
cells or cell envelopes, indicating that the interaction 
between the protein complexes is strong. The distance 
between the cytoplasmic membrane and S layer is 
wide and varies between 20 and 25 nm (Sulfolobales, 
Thermoproteales), 40 nm (e.g., Pyrolobus), and 70 nm 
(Staphylothermus). In the S-layer-deficient Ignicoccus, 
the periplasm has been observed to vary between 20 
and 500 nm in a single cell (120) (see “The outer 
membrane of Ignicoccus and the surface layer of 
Nanoarchaeum,” below). These features are in contrast
to those of many species that belong to the Euryarchaeota.
Intact S-layer sheets of the Euryarchaeota tend
to be difficult to obtain because the S layers are labile
and cannot be isolated by detergent extraction.
The width of the periplasm in most cells is 15 nm or
less and tends to be in the range of 5 to 10 nm. The close
spatial association between membrane and S layer 
correlates with the fact that their physicochemical in- 
teraction appears to be tight, while the interaction be- 
tween the protein complexes is weak; detergent ex- 
traction of the membrane lipids results in dissolution 
of the S layer. There are notable exceptions within the 
Euryarchaeota: (i) Picrophilus exhibits an S layer that 
is strong enough to withstand detergent extraction, 
and the periplasm is 40 nm wide; (ii) in addition to an 
S layer, all species of the genera Methanothermus and 
Methanopyrus have a pseudomurein sacculus as a 
second cell wall polymer that is highly stable,
serves to maintain cell shape, and may
protect the cells.

The characteristics of the structures of S layers 
are consistent with the primary structures of S-layer 
genes (2, 26). These studies revealed significant simi- 
larities of S-layer genes between closely related species 
of Euryarchaeota, including genes from members of 
the orders Methanococcales, Thermococcales, and 
Halobacteriales. 


PROTEINACEOUS SHEATHS COVER 
THE CELLS OF METHANOSPIRILLUM 
AND METHANOSAETA 


The filamentous chains of Methanospirillum 
hungatei and Methanosaeta concilii (formerly Metha- 
nothrix soehngenii) (111) are held together by a pro- 
teinaceous fibrillary sheath (163). Each single cell of 
Methanospirillum is surrounded by a separate elec- 
tron-dense layer. Standard techniques for cell wall iso- 
lation provide pure preparations of sheath material 
(62). Freeze-etched specimens showed each fibril to 
be composed of two subfibrils. The isolated sheath 
material is resistant to detergent, chaotropic agents, 
and common proteases. Isolated sheaths (62, 132) are 
composed of amino acids and neutral sugars. Com- 
puter image processing of tilted-view electron micro- 
graphs of isolated, negatively stained sheaths revealed 
a two-dimensional S-layer-like paracrystalline structure.
The structure consists of subunits with P1 symmetry
and unit cell dimensions of a = 12.0 nm, b =
2.9 nm, and γ = 93.7° (126), or of subunits
arranged on a lattice with P2 symmetry and unit cell dimensions of
a = 5.66 nm, b = 2.81 nm, and γ = 85.6° (135).
Treatment of isolated sheaths at 90°C with a combination
of β-mercaptoethanol and sodium dodecyl sulfate
under alkaline conditions results in the solubilization
of “glue peptides” and the liberation of single
hoops, the essential structural component of the 
cylindrical sheaths (133). Imaging the inner and outer 
surfaces of isolated sheaths with a bimorph scanning 


tunneling electron microscope confirmed that the 
sheaths form a paracrystalline structure and that they 
consist of a series of stacked hoops of ca. 2.5 nm in 
width (12). This study also showed that the sheath 
possesses minute pores and therefore is impervious 
to solutes with a hydrated radius of >0.3 nm. 
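
The two sets of lattice parameters reported for the sheath can be compared through the area of the two-dimensional unit cell, A = a · b · sin γ. The following is a minimal sketch of that arithmetic; the function name is illustrative and not taken from the cited studies, and the input values are the ones quoted above (126, 135).

```python
import math


def unit_cell_area(a_nm: float, b_nm: float, gamma_deg: float) -> float:
    """Area (nm^2) of a 2D unit cell with edges a, b and included angle gamma."""
    return a_nm * b_nm * math.sin(math.radians(gamma_deg))


# Lattice parameters quoted in the text for the Methanospirillum sheath.
p1_area = unit_cell_area(12.0, 2.9, 93.7)   # P1 description (126)
p2_area = unit_cell_area(5.66, 2.81, 85.6)  # P2 description (135)

print(f"P1 unit cell area: {p1_area:.2f} nm^2")
print(f"P2 unit cell area: {p2_area:.2f} nm^2")
print(f"Area ratio P1/P2:  {p1_area / p2_area:.2f}")
```

Because both angles are close to 90°, the areas are nearly the simple products a · b; the ratio shows how much larger the P1 cell is than the P2 cell.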

In cross sections of M. concilii, the envelope ap- 
pears as a double track about 25 to 30 nm in width, 
with a very dark inner and a more electron-transpar- 
ent outer layer. Only the inner layer participates in 
septum formation during cell division. Hence, it may 
be assumed that only the outer, electron-transparent 
layer represents a striated sheath that embraces many 
cells, whereas the inner layer represents a rigid cell 
wall sacculus surrounding individual cells (59, 162). 
They also show striation not only at the cylindric part 
of the sacculi, but also at the septa. This indicates that 
both layers seen in cross sections of whole cells be- 
long to the same morphological entity, which there- 
fore may not fit the definition of a sheath in the strict 
sense of the word. Chemical analysis of isolated en- 
velopes revealed a complex amino acid pattern and 
the presence of neutral sugars (mainly glucose, man- 
nose), resembling the composition of the sheath of 
M. hungatei (62, 132). After hydrazinolysis of sheath
preparations from M. concilii strain FE, an asparagine- 
rich glycoprotein fraction was obtained (114). This 
is indicative of the presence of asparaginylrhamnose 
linkages on Asn-X-Ser glycosylation sites in the 
sheath glycoprotein. 


PROTEIN SUBUNITS FORM THE CELL 
WALL OF METHANOCOCCI 


All species belonging to the order Methanococcales
possess hexagonal S layers as exclusive cell wall
components outside the cytoplasmic membrane (Fig. 5).
As the Methanococcales include mesophilic, thermophilic,
and extremely thermophilic species, they represent
an ideal model system for studying thermal
adaptation of S layers. The special features of these 
S layers are described below with reference to other 
bacterial and archaeal S-layer proteins (2, 5, 26, 77, 
79, 109). 


Gene Sequences 


The first Methanococcales gene sequence for an 
S-layer protein was for the mesophile, Methanococ- 
cus voltae (82). Additional putative genes for S-layer 
proteins have been identified in the complete genome 
sequences of the mesophile Methanococcus mari- 
paludis (165) and the hyperthermophile, Methano- 
caldococcus jannaschii (19). Based on the conserved 
N-terminal region, primers were constructed for PCR 
amplification and sequencing of previously unidentified
S-layer genes from the mesophile Methanococcus
vannielii, the thermophile Methanothermococcus
thermolithotrophicus, and the hyperthermophile 
Methanotorris igneus (Table 3). The genes were con- 
firmed to be S-layer genes by purification of the pro- 
teins they encoded and determination of their N ter- 
mini by Edman degradation (1, 2). 


Signal sequences for secretion 


Most S-layer proteins contain N-terminal signal 
sequences that allow their secretion across the cyto- 
plasmic membrane by the general secretory pathway 
(15, 42, 123) (see Chapter 17). The putative 28-
amino-acid leader peptides of proteins from the
Methanococcales (Fig. 6) are highly conserved (1, 2). 
They display typical characteristics of a signal se- 
quence with a nonhydrophobic (n)-region, a hy- 
drophobic (h)-core, and a charged (c)-region with an 
alanine residue at the peptide cleavage site (10). 

The 22- and 30-amino-acid-long signal peptides 
of M. fervidus (17) and M. mazei (161), respectively, 
are not homologous with signal peptides of the 
Methanococcales. In addition to S-layer proteins, ar- 
chaeal flagellins possess leader peptides (3, 5, 29). 
However, these short positively charged leader pep- 
tides consist of only 4 to 14 (in many cases 12) amino 
acids, with an invariant glycine at the cleavage site, 
and have little similarity to the signal peptides of 
S-layer proteins. 

Several bacteria produce excess amounts of 
S-layer proteins to ensure complete coverage of the 
cells during all phases of growth. The thermophilic 
bacterium, Geobacillus stearothermophilus, appears 
to produce an S-layer protein pool in the peptidogly- 
can layer (16). Several bacteria have been reported to 
shed S-layer material into the culture medium (15). 
For the bacterium, Acinetobacter, this occurs as a re- 
sult of an overproduction of new S-layer protein 
(143). Considerable quantities of S-layer proteins are 
released into the culture medium from M. vannielii, 
M. thermolithotrophicus, and M. jannaschii (between 


about 14% and 50% of the total S-layer protein). For 
M. thermolithotrophicus this commences early and 
continues throughout the growth phase (Fig. 7). A 
similar pattern was observed with the other two 
methanogens (data not shown). 


Regulatory sequences controlling high-level 
gene expression 


Analysis of the codon usage, ribosome-binding 
sites, and transcriptional promoters of S-layer genes 
indicates they are highly expressed (15). This is con- 
sistent with fast regeneration after S layers have been 
extracted by nonlethal methods. M. voltae proto- 
plasts completely regenerated S layers within 60 min, 
although the mean generation time of growing cells 
is 16 h (43, 110). Large quantities of sheetlike S-layer 
patches were observed in the regeneration medium. 
Because of their efficient expression, S-layer genes
might be suitable for constructing vectors for heterologous
overproduction of proteins.

Regulatory sequences involved in transcription 
(see Chapter 6) and translation (see Chapter 8) were 
examined in S-layer genes from members of the four 
genera, Methanococcus, Methanothermococcus, Meth- 
anocaldococcus, and Methanotorris (Table 4). For 
M. jannaschii genes, a TATA box and the “factor B 
recognition element” (BRE) are located 19 and 29 
nucleotides upstream from the transcription start, re- 
spectively. The translation start and the Shine-Dal- 
garno sequence are located downstream from the 
transcription start site beginning at nucleotides 45 and 
33, respectively. The TATA box perfectly matches the 
consensus sequence for methanogenic Archaea (142). 

In contrast to M. jannaschii, several promoters
have been described for the S-layer gene of M. voltae 
(65). The ribosome-binding site, usually localized 3 to 
9 nucleotides upstream of the translation start site 
(30), was found to be complementary to a region at 
the 3’ terminus of the 16S rRNA of M. jannaschii.
Translation is terminated with a series of stop codons,
which is a common feature of methanogenic Archaea
(30). They are followed by a poly(A)/poly(T) sequence,


Table 3. S-layer genes from mesophilic and (hyper)thermophilic Methanococcales

Species                                   Opt. growth temp. (°C)  Gene    Nucleotide accession no.  Protein accession no.
Methanotorris igneus                      88                      slmi 1  AJ564995ᵃ                 Q6KEQ4ᵇ
Methanocaldococcus jannaschii             85                      slmj 1  AJ311636ᵃ                 Q58232ᵇ
Methanothermococcus thermolithotrophicus  65                      slmt 1  AJ308554ᵃ                 Q8X235ᵇ
Methanococcus vannielii                   37                      slmv 1  AJ308553ᵃ                 Q8X234ᵇ
Methanococcus voltae                      37                      sla     M59200ᶜ                   Q50833ᵇ
Methanococcus maripaludis                 37                      slp     NC_005791ᶜ                NP_987503ᶜ

ᵃEMBL Database.
ᵇUniProt/Swiss-Prot.
ᶜGenBank/NCBI.
Figure 6. Comparison of the leader peptides of the S-layer proteins
of Methanococcales. (1) M. igneus; (2) M. jannaschii; (3) M. thermo- 
lithotrophicus; (4) M. vannielii; (5) M. voltae; (6) M. maripaludis. 


which probably leads to formation of a hairpin struc- 
ture and terminates transcription. Similar promoter 
regions were suggested for the other Methanococcales; 
however, it seems they are located at different posi- 
tions within the gene sequence (Table 4). 
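
The spacing between the Shine-Dalgarno sequence and the translation start can be checked directly from the positions listed in Table 4. The sketch below is illustrative only: the dictionary simply transcribes the Table 4 positions, and counting the spacer as the nucleotides strictly between the last Shine-Dalgarno base and the A of ATG is an assumed convention, not one stated in the text.

```python
# Positions relative to the transcription start, transcribed from Table 4.
TABLE4_POSITIONS = {
    "M. jannaschii": {"sd": (33, 39), "start": 45},
    "M. igneus": {"sd": (37, 43), "start": 49},
    "M. thermolithotrophicus": {"sd": (64, 70), "start": 77},
    "M. vannielii": {"sd": (60, 66), "start": 72},
    "M. voltae I": {"sd": (444, 450), "start": 456},
    "M. voltae II": {"sd": (204, 210), "start": 216},
}


def sd_spacer_length(sd_end: int, start: int) -> int:
    """Nucleotides strictly between the last SD base and the A of the start codon."""
    return start - sd_end - 1


spacers = {
    name: sd_spacer_length(entry["sd"][1], entry["start"])
    for name, entry in TABLE4_POSITIONS.items()
}

for name, spacer in spacers.items():
    # The text quotes 3 to 9 nucleotides as the usual distance (30).
    flag = "within" if 3 <= spacer <= 9 else "outside"
    print(f"{name}: spacer of {spacer} nt ({flag} the 3 to 9 nt range)")
```

Under this convention, every gene in Table 4 falls inside the 3 to 9 nucleotide range cited for methanogens.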


Amino Acid Composition and Primary Structure 


Molecular characteristics deduced from the gene 
sequences of S-layer proteins from Methanococcales 
are compiled in Table 5. They are all slightly acidic 
proteins with molecular masses ranging from 56 to 
61 kDa. 


Glycosylation 


Thermal stabilization of S-layer proteins may be 
attributed to posttranslational modifications (e.g., 
glycosylation, phosphorylation), covalent cross-link- 
ing, or salt bridging (15, 40) (see Chapter 11). Gly- 
cosylation of S-layer proteins is in general well char- 
acterized (101), e.g., for the hyperthermophilic 
methanogenic species M. fervidus and Methanother- 
mus sociabilis (17, 66, 108). 
Figure 7. Distribution of S-layer protein in cells and in the culture
medium during growth of M. thermolithotrophicus. 


Table 4. Putative regulatory sequences for S-layer genes (25)

Region                             Sequence       Positionᵃ

Promoter
  a. BRE box                       -CGAAA-ᵇ
     M. jannaschiiᶜ                -CGTAA-        -33 to -29
     M. igneus                     -CGTAA-        -35 to -31
     M. thermolithotrophicus       -CGTAA-        -35 to -31
     M. vannielii                  -CGTAA-        -40 to -36
     M. voltae I                   -CGTAA-        -34 to -30
     M. voltae II                  -CGTAA-        -33 to -29
  b. TATA box                      -AA/TTTATATA-ᵇ
     M. jannaschii                 -TTTATATA-     -26 to -19
     M. igneus                     -TTTATATA-     -28 to -21
     M. thermolithotrophicus       -TATATATA-     -28 to -21
     M. vannielii                  -TATAATAA-     -32 to -25
     M. voltae I                   -TATATATA-     -27 to -20
     M. voltae II                  -AATAAAA-      -26 to -19
  c. Transcription start           -A/TTGC-ᵇ
     M. jannaschii                 -ATAC-         1
     M. igneus                     -ATCG-         1
     M. thermolithotrophicus       -ATCC-         1
     M. vannielii                  -ATAC-         1
     M. voltae I                   -ATTT-         1
     M. voltae II                  -ATAC-         1
Shine-Dalgarno sequence
     M. jannaschii                 -AGGTGAT-      33-39
     M. igneus                     -AGGTGAT-      37-43
     M. thermolithotrophicus       -AGGGTGA-      64-70
     M. vannielii                  -AGGTGAA-      60-66
     M. voltae I                   -AGGTGAT-      444-450
     M. voltae II                  -AGGTGAT-      204-210
Translation start
     M. jannaschii                 -ATG-          45
     M. igneus                     -ATG-          49
     M. thermolithotrophicus       -ATG-          77
     M. vannielii                  -ATG-          72
     M. voltae I                   -ATG-          456
     M. voltae II                  -ATG-          216

ᵃPosition relative to the transcription start.
ᵇConsensus sequence.
ᶜFull name of organisms: Methanocaldococcus jannaschii, Methanotorris
igneus, Methanothermococcus thermolithotrophicus, Methanococcus vannielii,
and Methanococcus voltae.


A larger number of N-glycan sites are predicted 
in the primary amino acid sequences of S-layer proteins 
from the hyperthermophilic M. jannaschii compared 
with their mesophilic relatives (2, 26, 27, 77). The 
same was found for the S-layer protein of the hyper- 
thermophilic M. igneus (1, 25), suggesting a role for 
glycosylation in the thermostabilization of these proteins.
Although conventional staining methods for the
detection of glycoproteins (periodic acid-Schiff [PAS])
were negative, positive signals were obtained with
more sensitive immunoblots (Fig. 8). In addition, the
S-layer proteins of Methanococcales showed apparently
higher molecular masses in SDS-polyacrylamide gel
electrophoresis (PAGE) than expected from their gene
sequences. Additional indirect evidence for posttrans- 
Table 5. S-layer proteins from selected mesophilic and (hyper)thermophilic Methanococcales

Speciesᵃ                  Size   Molecular mass  Isoelectric  N-Glycosylation  Cys      Ala      Asp+Glu  Arg+Lys
                          (aa)   (daltons)       point        sites            (mol %)  (mol %)  (mol %)  (mol %)
M. igneus                 519    55,669          4.68         8                0.4      9.4      15.4     11.2
M. jannaschii             558    60,547          4.27         8                0.4      9.9      20.4     10.8
M. thermolithotrophicus   559    59,225          4.30         5                -        12.2     18.2     10.2
M. vannielii              566    59,064          4.29         -                -        16.3     14.3     8.1
M. voltae                 565    59,707          4.15         2                -        14.0     18.6     8.7
M. maripaludis            575    58,948          3.90                          -        17.6     16.9     5.7

ᵃFull name of organisms: Methanocaldococcus jannaschii, Methanotorris igneus, Methanothermococcus thermolithotrophicus, Methanococcus vannielii,
Methanococcus voltae, and Methanococcus maripaludis.



lational modification is indicated by the smaller size 
of the S-layer protein from M. jannaschii when het- 
erologously expressed in Escherichia coli (Fig. 9). 
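
Predicted N-glycosylation sites of the kind counted in Table 5 are sequons of the form Asn-X-Ser/Thr. The following is a minimal sketch of such a count, not the Prosite procedure itself: the exclusion of proline at position X is an assumption based on the commonly used sequon definition, and the toy sequence is hypothetical rather than an actual S-layer protein.

```python
import re
from typing import List

# Lookahead so that overlapping sequons (e.g. "NNSS") are each reported.
# Assumption: X may be any residue except proline.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence, used only to exercise the function.
toy_sequence = "MKNGSAANPTLLNNSSQ"
sites = find_sequons(toy_sequence)
print(f"{len(sites)} predicted sequons at positions {sites}")
```

A sequon count of this kind only gives potential sites; whether a site is actually glycosylated has to be shown experimentally, as the staining and immunoblot results above illustrate.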

Recently, it has been demonstrated that flagellin 
proteins, and probably the S-layer protein of M. voltae, 
are glycosylated. The glycan structure of the flagellin 
elucidated by NMR analysis was shown to be a novel 
trisaccharide composed of β-ManpNAcA6Thr-(1-4)-β-
GlcpNAc3NAcA-(1-3)-β-GlcpNAc linked to Asn (145).
This low degree of glycosylation might not be detected 
by less sensitive glycoprotein-staining methods. 


Figure 8. Immunoblot for glycoprotein detection. M, marker
proteins (negative control); S-layer proteins of (1) M. thermolithotrophicus
(80 kDa); (2) M. vannielii (60 kDa); (3) M. jannaschii
(80 kDa). S-layer proteins were electroeluted from
SDS-polyacrylamide gels, and 5 µg was applied to the gel for subsequent
immunoblotting.


Thermostabilization 


In addition to posttranslational modifications, in- 
trinsic features of the polypeptide chain contribute to 
thermostabilization of proteins. Sequences of 115 pro- 
teins (S-layer proteins were not included) from M. jan- 
naschii were compared with homologs from meso- 
philic Methanococcales (49). Characteristics of the 
proteins from thermophiles included higher residue 
volume and hydrophobicity, a higher percentage of 
charged amino acids (especially Glu, Lys, and Arg), 
and a lower percentage of uncharged polar residues 
(Ser, Thr, Asn, and Gln). In a similar study a large num- 
ber of proteins from mesophilic and thermophilic to 
extreme thermophilic Bacillus and Methanococcales 


Figure 9. Ten percent SDS-PAGE of native and recombinant S-layer
protein of M. jannaschii.
species were compared (95). An increase of Ile, Glu,
Lys, and Arg and a decrease in Met, Asn, Gln, Ser, and 
Thr was observed in the thermophilic methanococcal 
proteins. In recent studies the complete genome se- 
quences of mesophiles and thermophiles were analyzed 
(21). A large difference between the proportions of 
charged versus polar (noncharged) amino acids was 
found to be a common signature of all hyperther- 
mophilic organisms. Ionic interactions may provide a 
mechanism for thermostabilization (49), and the pro- 
portional increase of oppositely charged residues in hy- 
perthermophiles may provide a thermodynamic ad- 
vantage due to the increased stability of coulombic 
interactions with temperature (21). Electrostatic inter- 
actions are important for S-layer stability in extreme 
halophiles (137) as well as in several bacteria (122, 123, 
128). Sequences of putative soluble proteins from complete
genomes of 8 thermophiles and 12 mesophiles
were analyzed to gain insight into determinants of protein
thermostability. The stabilizing factors include reduced
protein size, increases in the number of residues involved
in hydrogen bonding, β-strand content, and
helix stabilization through ion pairs. There are also significant
increases in the relative amounts of charged
and hydrophobic β-branched amino acids and decreases
in uncharged polar amino acids in proteins
from thermophiles relative to mesophilic organisms.
Factors such as the relative proportion of residues in 
loops, proline and glycine content, and helix capping 
do not appear to be important (22). 
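
The difference between the proportions of charged and uncharged polar residues described above can be computed from a sequence in a few lines. This is a rough sketch under stated assumptions: charged residues are taken as Asp, Glu, Lys, and Arg, uncharged polar residues as Asn, Gln, Ser, and Thr, and both toy sequences are hypothetical and are not taken from any of the cited genomes.

```python
CHARGED = set("DEKR")          # Asp, Glu, Lys, Arg
POLAR_UNCHARGED = set("NQST")  # Asn, Gln, Ser, Thr


def charged_minus_polar(protein: str) -> float:
    """Percentage of charged residues minus percentage of uncharged polar residues."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    charged = 100.0 * sum(res in CHARGED for res in seq) / len(seq)
    polar = 100.0 * sum(res in POLAR_UNCHARGED for res in seq) / len(seq)
    return charged - polar


# Hypothetical toy sequences chosen to show the two extremes.
thermophile_like = "MKEELKRKIEELLKKAEE"
mesophile_like = "MSNTQSLTNASQTLSNAG"

print(f"thermophile-like: {charged_minus_polar(thermophile_like):+.1f}")
print(f"mesophile-like:   {charged_minus_polar(mesophile_like):+.1f}")
```

Applied to whole proteomes rather than single toy sequences, a larger positive value would correspond to the signature reported for hyperthermophiles (21).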

Similarly, the histones from mesophilic (Methan- 
obacterium formicicum), thermophilic (Methano- 
thermobacter thermautotrophicus), and hyperther- 
mophilic (M. fervidus) archaea, which have similar 
amino acid sequences but very different thermody- 
namic stabilities (93, 140), are believed to be stabi- 
lized by buried intramolecular arginine-aspartate in- 
teractions and intramolecular salt bridges on the 
surface of histone dimers. These structural features 


are also present in S-layer proteins of Methanococ- 
cales. The S-layer proteins of the thermophilic and 
hyperthermophilic methanococci exhibited an increase
in basic amino acids and a reduction of some
amino acids with nonpolar side chains, e.g., alanine,
compared with their mesophilic counterparts (Table 5).
The overall hydrophobicity is higher for the S-layer
proteins from the mesophilic strains, indicating that it
does not play a major role in adaptation to high temperatures
in Methanococcales. As shown in Fig. 10,
hydrophilic residues dominate in the S-layer protein of 
M. jannaschii, except in the N-terminal region, where 
hydrophobic residues may be involved in transmem- 
brane transport or binding. 
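
A hydropathy profile of the kind shown in Fig. 10 is usually a sliding-window average of per-residue hydropathy values. The sketch below assumes the Kyte-Doolittle scale and a 9-residue window; the text does not state which scale or window was used for Fig. 10, and the toy sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy of every full window along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical toy sequence: hydrophobic N-terminal stretch, hydrophilic remainder.
toy_sequence = "MKKLLAIVLLAAVSGAEEKDNSTDEKR"
profile = hydropathy_profile(toy_sequence)
for position, value in enumerate(profile, start=1):
    print(f"window starting at residue {position:2d}: {value:+.2f}")
```

In such a profile, a leader peptide shows up as a run of positive values near the N terminus, followed by mostly negative values where hydrophilic residues dominate.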

An increase of solvent-accessible surfaces in pro- 
teins of hyperthermophiles has been described (21). 
Thus, the increase in charged amino acids, especially 
lysine, in the S-layer proteins of M. thermolitho- 
trophicus, M. jannaschii, and M. igneus could con- 
tribute to their increased thermal stability (2, 26). An 
increase in charged residues is present in the S-layer 
proteins of M. mazei (mesophilic) < M. thermauto- 
trophicus (thermophilic) < M. fervidus (hyperther- 
mophilic) and A. fulgidus (hyperthermophilic). The 
S-layer glycoprotein of M. fervidus contains high
amounts of Asn and has a basic isoelectric point (17). The
significance of the high Asn content is unknown.

Another significant feature of the S-layer pro- 
teins from the hyperthermophiles M. jannaschii and 
M. igneus is the occurrence of cysteine, which has 
only been detected in a few S-layer proteins (123, 
128, 129). Intramolecular disulfide bridges may be 
another factor involved in the thermal stability of this 
surface protein. 


Secondary Structures 


A higher content of ordered structures (e.g., helical
conformations) and fewer loops are predicted in


Figure 10. Hydropathy profile of the S-layer protein of M. jannaschii. 
S-layer proteins from the mesophilic M. voltae and
M. vannielii compared with M. thermolithotrophicus
and M. jannaschii (Table 6). In contrast to most com- 
mon conceptions of factors determining protein 
thermostability, the relatively low extent of ordered 
secondary structures in the S-layer proteins of the hy- 
perthermophilic members of the Methanococcales 
suggests that they are flexible molecules. This might 
explain their unusual behavior in SDS-PAGE, where 
they often appear in several conformational states 
(25). The migration of the purified S-layer protein 
from M. jannaschii varied in response to cations, pH, 
and temperature (Fig. 11). A similar temperature-de- 
pendent electrophoretic migration has been observed 
for the S-layer protein of the hyperthermophile 
Nanoarchaeum equitans (Schuster and Rachel, un- 
published). Whether conformational adaptations of 
S-layer proteins may occur in vivo on the cell surface 
of Archaea living under extreme thermophilic condi- 
tions has to be investigated. 


Sequence Comparison 


In general, minor structural differences were ob- 
served in the primary and secondary structures of the 
S-layer proteins of mesophilic and hyperthermophilic 
Methanococcales. The most striking differences were
found with respect to the occurrence of cysteine, the
amount of basic amino acid residues, and the degree
of hydrophobicity.

Apart from the leader peptides, sequence align- 
ments revealed a notable degree of homology be- 
tween the S-layer proteins of the mesophilic up to the 
(hyper)thermophilic Methanococcales, especially at 
the N and C termini (Table 7). The S-layer genes of 
the Methanococcales shared a significant homology 
with the presumptive S-layer genes of the hyperthermophilic
heterotrophs Pyrococcus abyssi and Pyrococcus
horikoshii. No significant similarity was
found to any other archaeal S-layer protein. 


Table 6. Secondary structures

                           Predicted structural featuresᵃ
Speciesᵇ                   Helices   Sheets   Loops
M. voltae                  36        27       36
M. vannielii               45        19       36
M. thermolithotrophicus    27        28       46
M. jannaschii              22        25       51

ᵃPredicted by the PHD program, represented as a percentage of total structural
features.
ᵇFull name of organisms: Methanocaldococcus jannaschii, Methanothermococcus
thermolithotrophicus, Methanococcus vannielii, and Methanococcus
voltae.


Another group is formed by the S-layer proteins 
of Methanosarcina mazei and the gram-positive 
methanogens M. thermautotrophicus, M. fervidus, 
and M. sociabilis, which possess a significant degree 
of similarity to each other and to the sulfate-reduc- 
ing hyperthermophilic A. fulgidus. The S-layer pro- 
teins of halobacteria and of the hyperthermophilic 
crenarchaeon S. marinus shared no homologies with 
the methanogens (26, 27). 


TWO-LAYERED CELL WALLS CHARACTERIZE 
THE HYPERTHERMOPHILES 
METHANOTHERMUS AND METHANOPYRUS 


The hyperthermophilic methanogens M. fervidus 
(134) and Methanopyrus kandleri (87) have double- 
layered cell envelopes that are recognizable in ultra- 
thin sections (134). The pseudomurein has a thick- 
ness of ~15 to 20 nm and is covered by an external 
protein layer. This layer exhibits a hexagonal pattern 
with a center-to-center spacing of about 20 nm in 
freeze-etched cells of M. fervidus (Fig. 12a, b). In con- 
trast, a crystalline arrangement of the outermost layer 
has not been observed for Methanopyrus cells (Fig. 
12c, d). The S-layer glycoprotein of M. fervidus can 
be extracted with trichloroacetic acid and purified by 
reverse-phase chromatography using aqueous formic 
acid as eluant (17, 66, 108). 

The mature protein consists of 593 amino acids. 
Compared with mesophilic S-layer proteins, it has a 
significantly higher content of isoleucine, asparagine, 
and cysteine, and a 14% higher content of β-sheet 
structure. The glycoprotein also possesses a leader 
peptide and 20 sequon structures (e.g., asparagine- 
X-serine/threonine) as potential N-glycosylation sites 
(17). One type of oligosaccharide is present and consists 
of D-3-O-MetMan, D-Man, and D-GalNAc (66). 
It is linked via N-acetylgalactosamine to asparagine 
residues of the peptide moiety (Fig. 13). In the biosyn- 
thesis of the glycan chains, nucleotide-activated 
oligosaccharides seem to be involved (50). 
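
Potential N-glycosylation sites of the asparagine-X-serine/threonine type can be located in a protein sequence with a simple pattern search. The Python sketch below is only an illustration: the function name is hypothetical, the test peptide is invented, and the exclusion of proline at the X position follows a common convention that is not stated in this chapter.

```python
import re

# Asn-X-Ser/Thr sequon. Excluding proline at X is an assumption of this
# sketch (a widely used convention), not a statement from the chapter.
# The lookahead makes the match zero-width so overlapping sequons are found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Invented peptide: contains NAS, NPT (skipped because X is Pro), and NGT.
print(find_sequons("MNASKNPTQNGT"))
```

A sequon found this way is only a candidate site; whether it is actually glycosylated has to be established experimentally.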


SULFATED PROTEOGLYCAN-LIKE S LAYERS 
IN NEUTROPHILIC HALOPHILES 


The ultrastructure of the S layers of two genera 
of the order Halobacteriales, Halobacterium and 
Haloferax, has been studied in great detail (72, 74, 
144). Members of both genera require similarly ex- 
treme osmotic conditions for growth. The surface gly- 
coprotein of H. salinarum (former name: H. halo- 
bium [139]) was the first glycosylated protein detected 
in bacteria and archaea (97, 98). The S layers of both 
microorganisms completely cover the cells. At present 
levels of resolution (2 nm in projection, ~2.5 to 3 nm 
in three-dimensional reconstructions), they show the 
same symmetry (p6) and center-to-center spacing of 
15 nm and a high degree of similarity in the subunit 
organization of the oligomeric complexes (144). The 
width of the periplasmic space in both cases is about 
10 nm. 

Figure 11. SDS-PAGE migration of purified S-layer protein from M. jannaschii. SDS-PAGE performed after heat treatment at 
different temperatures for 5 min in denaturing buffer. 

The glycoprotein of H. salinarum possesses a 
stretch of 12 hydrophobic amino acids at the C ter- 
minus, which functions as a membrane anchor. Three 
different saccharide chains are linked to the peptide 
(Fig. 14). A sulfated pentasaccharide-repeating unit 
forms a glycosaminoglycan chain, which is linked to 
the asparagine residue at the second N-terminal position 
of the peptide chain via an N-acetylgalactosamine. 
Ten sulfated glucose-, glucuronic acid-, and 
iduronic acid-containing oligosaccharides are linked 
via glucose to an asparagine residue. In addition to 
the two types of N-linked glycan chains, about 15 
O-glycosidically linked glucosylgalactose disaccharides 
occur. The disaccharides form a cluster close to the 
transmembrane domain. The glycoprotein of H. salinarum 
is acidic because of the occurrence of more 
than 20% aspartic and glutamic acid residues and up 
to 50 mol of uronic acids and 50 mol of sulfate 
residues per mol of peptide. 

Table 7. Sequence homology of selected S-layer proteinsᵃ 

Species                        M. vol.  M. van.  M. lit.  M. jan.  M. ign.  P. aby.  P. hor. 
1. M. voltae                   -        44       48       38       31       23       28 
2. M. vannielii                47       -        49       44       31       24       31 
3. M. thermolithotrophicus     50       49       -        53       40       26       29 
4. M. jannaschii               40       44       53       -        35       25       33 
5. M. igneus                   34       35       43       39       -        n.d.     n.d. 
6. P. abyssi                   24       25       26       26       n.d.     -        79 
7. P. horikoshii               27       32       29       29       n.d.     79       - 

ᵃ Alignments of amino acid sequences (BLAST 2.0), represented as percent 
amino acid identity. 1, 2, Methanococcus; 3, Methanothermococcus; 
4, Methanocaldococcus; 5, Methanotorris; 6, 7, Pyrococcus. 

The mature polypeptide of the S-layer glycopro- 
tein of H. volcanii is composed of 794 amino acids. 
Similar to the surface glycoprotein of H. salinarum, 
a hydrophobic stretch of about 20 amino acids at the 
C terminus probably functions as a transmembrane 
domain. Glucosyl-(1->2)-galactose disaccharides are 
assumed to be linked to threonine residues clustering 
close to the membrane anchor. Sulfated and amino 
sugar-containing oligosaccharides are absent in the 
surface glycoprotein of H. volcanii. Newly synthe- 
sized S-layer glycoproteins of H. volcanii undergo a 
maturation step following translocation of the pro- 
tein across the plasma membrane. The processing 
step is unaffected by inhibition of protein synthesis 
and is apparently unrelated to glycosylation of the 
protein (38). The S-layer proteins of moderately and 
extremely halophilic archaea differ in the composition 
of the glycan chains (96). 

Figure 12. Transmission electron micrographs of M. fervidus and M. kandleri prepared by freeze etching. (a) Computed diffractogram, 
or power spectrum, of the image area marked in b. (b) M. fervidus, original micrograph of a freeze-etched cell. 
(c, d) M. kandleri. (c) Freeze-etched; (d) ultrathin section. All bars = 0.5 µm. 

To maintain a rod shape, halobacteria require a 
high concentration of extracellular NaCl (ca. >8% or 
>12%). Mg²⁺ ions may form salt bridges between 
sulfate and uronic acid residues of the oligosaccharides 
of S-layer glycoproteins of H. salinarum (156), 
because salt bridges are required for maintaining the 
integrity of the S-layer subunits. In the absence of 
Mg²⁺ ions, the S layer will disintegrate. 

The biosynthesis of the glycan of the glycopro- 
tein from H. salinarum has been studied in greater de- 
tail (90, 112, 136, 139, 152-156), while in the case of 
the neutrophilic methanogen M. fervidus, only acti- 
vated oligosaccharides have been isolated from cell 
extracts (50, 78). In both species dolichol derivatives 
serve as lipid carriers, while in the case of the pseudo- 
murein and methanochondroitin, undecaprenyl py- 
rophosphate functions as lipid carrier. According to 
the current knowledge, dolichyl-P-(P) is the universal 
lipid carrier in glycoprotein synthesis in Archaea, 
Bacteria, and Eucarya (78). 


Depending on the glycan chain, C60-dolichyl- 
monophosphate and dolichylpyrophosphate serve as 
the lipid carriers for the two sulfated glycan chains, 
the glycosaminoglycan and the second sulfated oligosaccharide, 
respectively. The complete glycosaminoglycan, 
including sulfation, is synthesized inside the 
cytoplasmic membrane on the lipid carrier dolichyl-P 
and then transferred to the nascent protein. The linkage 
to the protein takes place at the cell surface. The 
acceptor peptide is Asn-Ala-Ser. Replacement of the 
serine residue of the consensus sequence by valine, 
leucine, or asparagine did not prevent N-glycosylation. 
In contrast, N-glycosylation did not occur at Asn-479 
when Ser-481 was removed (164), which indicates the 
presence of two different N-glycosyltransferases. In 
the case of the second sulfated oligosaccharide (cf. 
above), completely sulfated lipid-linked precursors 
are formed before transfer to the protein. Prior to the 
transfer of this saccharide chain to the cell surface, 
some glucose residues are transiently methylated at 
carbon 3 (139, 156). Dolichol-linked oligosaccharides 
are also involved in the biosynthesis of the S-layer 
glycoprotein of H. volcanii (86). Along with protein 
glycosylation, additional posttranslational modification 
of the S-layer glycoprotein occurs at the external 
face of the plasma membrane of H. volcanii (38) 
(see Chapter 11). 

a-D-3-O-MetManp-(1->6)-a-D-3-O-MetManp-[(1->2)-a-D-Manp],-(1->4)-D-GalNAc 

Figure 13. Proposed structure of the oligosaccharide of the S-layer glycoprotein of M. fervidus. Modified from The Journal of 
Biological Chemistry (66) with permission of the publisher. 

Figure 14. Proposed structure of the three glycan moieties of the S-layer glycoprotein of H. salinarum. The building block composed 
of five different sugars (glycan I) is linearly repeated 10 to 12 times. Modified from Halophilic Bacteria, vol. 2, CRC 
Press, Inc. (156), with permission of the publisher. 

Tunicamycin and bacitracin interfere with glycosylation 
in H. salinarum, while the glycan synthesis 
of the S-layer glycoprotein of H. volcanii is not affected 
(38). In contrast to H. salinarum, where C60-dolichyl- 
monophosphate and dolichylpyrophosphate 
serve as the lipid carriers, only dolichyl monophosphate 
oligosaccharides were found in H. volcanii (86, 
89). For comparison, Sulfolobus acidocaldarius is 
also sensitive to tunicamycin (38). Experiments with 
exogenously added peptides with an N-glycosylation 
site, which cannot penetrate the cytoplasmic membrane, 
indicated an oligosaccharide transfer outside 
of the cytoplasmic membrane (89), because oligosaccharides 
were linked to these peptides by the cells. 
The proteins can also undergo further posttranslational 
modifications, which may include isoprenylation 
(83) or linkage of diphytanylglyceryl phosphate 
residues (76). In addition to H. salinarum and H. volcanii, 
the S-layer protein from Haloarcula japonica 
has also been purified and heterologously expressed 
(103, 147, 148). 

A glycoprotein S-layer with a three-dimensional 
structure and some properties similar to that of 
Halobacteria was observed and described for Methan- 
oplanus limicola (24). Its dome-shaped complexes are 
arranged on a lattice with p6 symmetry and a center- 
to-center distance of 14.7 nm. In the cell envelope, 
they are in proximity to the cell membrane; the 
periplasm is 5 to 10 nm wide. The protein mass was 
determined to be 135 kDa (native, glycosylated) 
and 115 kDa (after deglycosylation). 
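
The lattice constant and the two mass values given for the M. limicola S layer lend themselves to a short worked calculation. The Python sketch below estimates the area of one hexagonal unit cell and the share of the native protein mass that is lost on deglycosylation. It assumes an ideal p6 lattice whose unit cell area is (√3/2)·a², and it assumes that the whole mass difference is due to glycan; both are simplifications introduced here, and the function names are hypothetical.

```python
import math


def hexagonal_unit_cell_area(lattice_constant_nm: float) -> float:
    """Area (nm^2) of one unit cell of an ideal hexagonal lattice."""
    return (math.sqrt(3) / 2) * lattice_constant_nm ** 2


def glycan_mass_fraction(native_kda: float, deglycosylated_kda: float) -> float:
    """Fraction of the native mass removed by deglycosylation.

    Assumption: the full mass difference is carbohydrate.
    """
    return (native_kda - deglycosylated_kda) / native_kda


area = hexagonal_unit_cell_area(14.7)      # center-to-center distance from the text
fraction = glycan_mass_fraction(135, 115)  # masses from the text, in kDa

print(f"unit cell area: {area:.1f} nm^2")
print(f"apparent glycan share: {fraction:.1%}")
```

The same area formula applies to the other p6 lattices described in this chapter if the respective lattice constant is substituted.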


GLUTAMINYLGLYCAN IN NATRONOCOCCI 


The majority of bacterial and archaeal exopoly- 
mers are polysaccharides, but exopolymers composed 
of L- or D-glutamate are also formed. In Bacteria, in 
general, α-linked polymers composed of glutamyl or 
glutaminyl residues possess the L-configuration, while 
in γ-linked exopeptides the glutamic acid residues 
have the D-configuration. The poly-γ-D-glutamyl 
polymers occur in the phylogenetically related gen- 
era Bacillus, Sporosarcina, and Planococcus. Similar 
polymers have been found in the archaeal genus Na- 
tronococcus (107), which are extremely halophilic 
cocci that grow optimally in alkaline and saline 
biotopes. The cell polymer of N. occultus is formed 
by a polyglutamine, which, in contrast to the bacter- 
ial polymer, is glycosylated. The cell wall polymer is 
composed of one amidated amino acid (L-glutamine), 
two amino sugars (glucosamine and galactosamine), 
two uronic acids (galacturonic acid and glucuronic 
acid), and a hexose sugar (glucose). In the intact polymer, 
the glutamine residues are linked via their γ-carboxylic 
group, forming a chain of about 60 monomers. 
Two types of oligosaccharides are linked to the polyglutamine 
backbone (Fig. 15). One oligosaccharide 
consists of an N-acetylglucosamine pentasaccharide 
at the reducing end and a galacturonic acid oligosaccharide 
at the nonreducing end. The second oligosaccharide 
has an N-acetylgalactosamine disaccharide 
at the reducing end and a maltose unit at the nonreducing 
end. 

N-linked oligosaccharides of bacterial and ar- 
chaeal cell wall glycoconjugates are usually linked via 
galactosamine, glucose, or rhamnose to the β-amide 
group of asparagine, while in N. occultus the oligosaccharides 
are linked to the α-carboxylic groups of 
the polyglutamine backbone via an N-amide linkage. 
Therefore, the exopolymer of N. occultus represents a 
novel type of naturally occurring glycoconjugates. 

In Bacillus anthracis the polyglutamate polymer 
plays an important role in pathogenesis. It has not 
been determined whether the structurally related 
glutaminylglycan of the natronococci is a pathogenic 
factor. 

Figure 15. Proposed structure of the repeating units (regions A to C) of the cell wall polymer of N. occultus. Modified from 
the European Journal of Biochemistry (107) with permission of the publisher. 


EXTRACELLULAR CANNULAE ARE 
A SPECIFIC FEATURE OF THE 
GENUS PYRODICTIUM 


All known Pyrodictium species are able to grow 
at temperatures between 75 and 110°C under strictly 
anoxic conditions. The cells are covered by an S layer, 
i.e., a two-dimensional protein array, with hexago- 
nal symmetry and a lattice constant of ~21 nm (Fig. 
16a). The dome-shaped complexes are anchored to 
the cytoplasmic membrane by filiform stalks and span 
the periplasmic space with a constant width of ap- 
proximately 35 nm (6, 37, 51). During growth on el- 
emental sulfur, Pyrodictium cells form an extracellu- 
lar matrix, in which the cells are entrapped. This 
matrix is an extended network of hollow cylinders 
consisting of helically arranged subunits (80, 121). 
Each of the cannulae has an outer diameter of 25 nm 
(Fig. 16b, c) and is made up of (at least) three homol- 
ogous glycoproteins. In vivo observations of grow- 
ing Pyrodictium cells at 90°C under anoxic condi- 
tions demonstrated that this network is dynamic; cell 
division and synthesis of the cannulae are directly 
linked (53). After cell division, the daughter cells re- 
main interconnected by cannulae loops. Multiple 
generations result in the formation of a colony of cells 
entrapped in a dense cannulae network, in which 
each cell has connections with its neighbors. Analysis 
of dual-axis tilt series in cryoelectron tomography 
helped to reveal that the cannulae interconnect indi- 
vidual cells with each other, but only on the level of 
the periplasmic spaces of the cells; the cannulae do 
not enter the cytoplasms (Fig. 16c) (106). 


THE OUTER MEMBRANE OF IGNICOCCUS 
AND THE SURFACE LAYER OF 
NANOARCHAEUM 


From a sample taken at a hydrothermal vent of 
the Kolbeinsey Ridge, north of Iceland, a coculture of 
two coccoid, hyperthermophilic archaea was obtained 
(55). Based on 16S rRNA gene sequences, one belongs 
to the genus Ignicoccus, while the other was so differ- 
ent that it was attributed to a new phylum, Nanoar- 
chaeota. Cells of Ignicoccus can be cultivated inde- 
pendently under strictly anaerobic conditions and 
thrive by sulfur-hydrogen autotrophy. However, Nano- 
archaeum equitans can only be grown in coculture 
with, and in close contact to, cells of Ignicoccus sp. 
strain KIN4I. Ignicoccus cells are unique among the 
Archaea in two important ways: (i) they have a huge 
periplasm with variable width (20 to 500 nm), which 
contains vesicles; (ii) cells do not possess an S layer or 
any other cell wall polymer but have a unique outer 
membrane. Freeze-etch experiments revealed that the 
Ignicoccus outer membrane fractures into two halves. 
This type of behavior is similar to the way in which 
many biological lipid membranes respond during 
freeze etching. The outer membrane is a highly dynamic 
structure: periplasmic vesicles can be observed 
in various stages of a fusion process, and vesicles are 
also released into the culture medium. The outer membrane 
is tightly packed with protein complexes, although 
there is no indication of crystallinity; the 
proteins are presently being investigated (104). 

Figure 16. (See the separate color insert for the color version of this illustration.) Transmission electron micrographs of P. occultum. 
(a) Pyrodictium cells after freeze etching, exhibiting an S-layer with p6 symmetry. Bar = 0.5 µm. (b) Extracellular cannulae, 
a specific feature of the genus Pyrodictium; negative staining with uranyl acetate; bar = 0.1 µm. (c) 3D tomogram of a 
frozen-hydrated Pyrodictium cell with two cannulae; bar = 0.2 µm. Modified from the Journal of Structural Biology (106) with 
permission of the publisher. 

334 KONIG ET AL. 

Ignicoccus sp. strain KIN4I and N. equitans both 
contain qualitatively identical amounts of glycerol 
ether lipids, archaeol and, to a minor degree, cald- 
archaeol (60). Cells of N. equitans are the smallest 
archaeal cocci presently known, and their ability to re- 
produce relies on the direct interaction with Ignicoccus 
cells. The reliance on a host is reflected in its genome 
sequence, which lacks genes for many important meta- 
bolic pathways (149). The ultrastructure of N. equi- 
tans is similar to many archaea. The cytoplasmic mem- 
brane is surrounded by a quasi-periplasmic space of 
constant width (20 nm) and covered by an S layer with 
sixfold symmetry and a lattice constant of about 15 nm 
(56). Future investigations of the unusual symbiosis of 
these two hyperthermophilic archaea aim at elucidat- 
ing which proteins of both cell envelopes are directly 
involved in the physical interaction and in the ex- 
change of metabolites from one cell to the other. 


PERSPECTIVE: THE NEXT FIVE YEARS 


The cell envelopes of the Archaea are often di- 
rectly exposed to extreme environmental conditions, 
and they cannot be stabilized by cellular factors. 
Adaptation to the same type of extreme environment 
has not led to the evolution of similar cell surface 
structures. For example, Halobacterium sp., Halo- 
coccus sp., and Natronococcus sp. thrive in saturated 
salt brines, but they possess quite different cell wall 
polymers (e.g., glycoprotein, heterosaccharide, or glu- 
taminylglycan). Differences in cell surface structures 
also exist for hyperthermophilic archaea. In this way 
extremophilic archaea provide a range of models for 
elucidating survival strategies of natural compounds 
and may give clues about molecular mechanisms of 
resistance against high temperature, low and high 
pH, and high-salt concentrations (41). Through stud- 
ies of their cell surfaces, these investigations may lead 
to applications of new biomaterials. S layers represent 
the most common cell surface layer of Archaea. They 
are the simplest biological membranes found in na- 
ture. A wide spectrum of applications for S layers has 
already emerged (see Chapter 22). Isolated S-layer 
subunits assemble into monomolecular crystalline arrays 
in suspension, on surfaces and interfaces. These 
lattices have functional groups on the surface in an 
identical position and orientation in a nanometer 
range. These characteristics have led to their applica- 
tion as ultrafiltration membranes, immobilization 
matrices for functional molecules, affinity microcarri- 
ers and biosensors, conjugate vaccines, carriers for 
Langmuir-Blodgett films, and reconstituted biological 
membranes and patterning elements in molecular nanotechnology 
(130). In the past, applied studies have 
almost exclusively been performed with bacterial S 
layers. Since archaeal S-layer (glyco-)proteins are of- 
ten resistant under extreme conditions, a new spec- 
trum of future developments with archaeal S-layer 
glycoproteins should be found. 

The genes of some archaeal S-layer proteins have 
been characterized (2, 17, 25-27, 88, 138, 161), and 
complete genome sequences have been published for 
many of the species described in this chapter (4, 19, 23, 
28, 44, 46, 52, 68-70, 105, 127, 131, 149). To date, 
cell wall biosynthesis genes have not been unambigu- 
ously defined. Knowledge of the complete genome may 
be helpful for identifying special enzymes involved in 
the biosynthesis and degradation of cell wall polymers. 

With respect to the evolution of methanococci, 
it seems that the common ancestor of the Methanococcales 
was probably a thermophile (75) and that 
mesophily is a modern adaptation. An exchange of an 
amino acid in mesophilic proteins may be the result 
of a relaxation of selection against this amino acid, 
which may be of importance in the extreme ther- 
mophilic counterparts. 

Resolving the 3D structure will be necessary 
to better understand the molecular stabilization 
mechanisms of hyperthermophilic S-layer 
proteins. The first successful crystallizations indicate 
that this goal may be achievable (27, 32, 41, 113). 


Chapter 15 


Lipids: Biosynthesis, Function, and Evolution 


YAN BOUCHER 


INTRODUCTION 


One of the fundamental ways in which the Ar- 
chaea differ from the Bacteria and Eucarya is the 
composition of their cell membrane. Core membrane 
lipids, in all domains of life, consist of two hydrocar- 
bon side chains linked to a glycerol moiety. In bacteria 
and eucarya, fatty acid side chains are ester linked to 
an sn-glycerol-3-phosphate (G3P) moiety. In archaea, 
on the other hand, isoprenoid side chains are ether 
linked to an sn-glycerol-1-phosphate (G1P) backbone. 
The side chains and the type of bond that links them 
to the glycerol moiety differ only quantitatively be- 
tween the Archaea and the other two domains of life. 
Indeed, a minor proportion of ether-linked lipids can 
be found in some eucarya (27) and bacteria (16), and 
fatty acid side chains are part of some archaeal lipids 
(11). The sn-1 stereochemistry of the glycerol back- 
bone, on the other hand, is truly unique to Archaea and 
remains the most distinctive feature of their lipids. 

A great diversity of archaeal lipids are derived 
from the sn-2,3-diacylglycerol diether basic structure. 
The most common is sn-2,3-diphytanylglycerol diether 
(archaeol) (Fig. 1), which is a core lipid in most archaea 
(Table 1) (8). Variations of this structure include the 
length of either or both side chains (25-carbon sester- 
terpanyl chain instead of a 20-carbon phytanyl chain), 
introduction of a hydroxy group at the C-3 of the 
phytanyl chain on the sn-3 position (hydroxyarchaeol), 
and linking of the terminal portion of the phytanyl 
chains to create a macrocyclic diether (Fig. 1). One 
of the most frequent and functionally important vari- 
ations in structure is the head-to-head condensation of 
two diether lipid molecules to form a glycerol-dialkyl- 
glycerol tetraether lipid (GDGT) (Fig. 2). The most 
widespread of these is caldarchaeol (GDGT-0), which 
does not contain any modification to its alkyl core. 
Several variations of this basic tetraether lipid also exist, 
including the introduction of a number of penta- or 
hexacyclic rings, the replacement of one of the glycerol 
moieties by nonitol, a break in one of the C40 alkyl 
chains, and the introduction of a bridge between the 
alkyl chains (Fig. 2). Diether and tetraether lipids usu- 
ally have a polar head group or sugar attached to the 
sn-1 position of their glycerol moieties, and can display 
various levels of unsaturation in their side chains (41). 

The biosynthetic steps leading to this wide vari- 
ety of core lipids are not all well understood, but many 
enzymes involved in assembling backbone structure 
have been identified and characterized. The synthesis 
of the isoprenoid building blocks, isopentenyl diphos- 
phate (IPP) and dimethylallyl diphosphate (DMAPP), 
their assembly in side chains, and attachment to the 
sn-glycerol-1-phosphate moiety are relatively well un- 
derstood. However, the order and nature of subsequent 
steps, such as attachment of head groups or sugars, 
saturation of the side chains and synthesis of tetraether 
lipids, are not as clearly defined. 

The variety of lipid structures observed in archaea 
is underlined by a variation in the subset of biosyn- 
thetic genes present in different organisms as well as 
the functional flexibility of those genes. The meval- 
onate pathway describes the synthesis of the IPP and 
DMAPP building blocks that form the isoprenoid side 
chains of archaeal lipids. This pathway evolved through 
a combination of processes that included orthologous 
and nonorthologous gene displacement, integration 
of components from eucarya and bacteria, and lateral 
gene transfer between members of the Archaea. Iso- 
prenyl phosphate synthases are responsible for elon- 
gating side chains using the end products of the meval- 
onate pathway. In some archaeal phyla they have 
gained the capacity to make chains longer than the 
basic 20 carbons through amino acid substitutions at 
a few specific positions. However, the most intriguing 
evolutionary development is the archaeal invention of 
three stereospecific enzymes, from broader protein 
families, that led to the unique chirality of their lipids: 
sn-glycerol-1-phosphate dehydrogenase (synthesis of 
the sn-glycerol-1-phosphate backbone), geranylgeranylglyceryl 
phosphate synthase, and digeranylgeranylglyceryl 
phosphate synthase (addition of the sn-2 and 
sn-3 side chains to the glycerol moiety, respectively). 

Figure 1. Basic structure of glycerol diether isoprenoid lipids. The 
sugars or polar head groups that are frequently attached to the 
sn-1 position of the glycerol moiety in archaeal diether lipids are 
not shown. 

The function of the structurally diverse core diether 
lipids is just beginning to be addressed. Some 
types of modifications, such as the introduction of pen- 
tacyclic rings or side-chain unsaturation, have been 
hypothesized to play a role in adaptation to different 
growth temperatures (12, 33). Likewise, some lipid- 
profile studies have suggested that the proportion of 
different lipids in the cell membrane can vary with en- 
vironmental conditions such as pressure, or physio- 
logical conditions such as growth phase (19, 47). 
This chapter summarizes the different biosynthetic 
steps of isoprenoid ether lipid biosynthesis in archaea, 
describing the underlying enzymatic reactions that 
have been characterized. The evolution of this lipid 
biosynthesis apparatus in a variety of archaea is dis- 
cussed. The effect of the environment on the nature of 
the lipids present in archaeal cell membranes is also 
scrutinized in an attempt to link structure and function. 


BIOSYNTHETIC PATHWAYS 


Synthesis of the Isoprenoid Building Blocks: 
IPP and DMAPP 


Similar to other isoprenoids, archaeal lipid side 
chains are assembled from two universal precursors: 
isopentenyl diphosphate (IPP) and its isomer dimethy- 
lallyl diphosphate (DMAPP). Most eucarya and some 
bacteria synthesize these precursors through the meval- 
onate pathway (1, 26). This pathway has been described 
in detail in these organisms and is always composed 
of five steps, as illustrated in Fig. 3: (i) conversion of 
acetyl-CoA and acetoacetyl-CoA to 3-hydroxy-3- 
methylglutaryl-CoA (HMG-CoA); (ii) reduction of 
HMG-CoA to mevalonate; (iii) phosphorylation of 
mevalonate; (iv) phosphorylation of phosphomeval- 
onate; and (v) conversion of diphosphomevalonate 
to IPP. Archaea also derive their isoprenoids from 
mevalonate, as shown by tracer studies more than two 
decades ago (21). Orthologs of the enzymes catalyzing 
the first three steps of the pathway in bacteria and eu- 
carya can also be found in archaea (45). However, the 
last two enzymes of the standard bacterial/eucaryal 
mevalonate pathway (phosphomevalonate kinase [PMK] 
and diphosphomevalonate decarboxylase [PPMD]) 
are missing in most archaea (Table 1). A few excep- 
tions are known, including the PPMD gene present 
in the Halobacteriales and Thermoplasmatales, and 
PMK and PPMD in Sulfolobales. The conversion of 
the PPMD product, IPP, to the other essential iso- 
prenoid precursor, DMAPP, is performed by isopen- 
tenyl diphosphate isomerase (IDI), of which two anal- 
ogous types exist. All archaea harbor IDI2, but the 
Halobacteriales also possess an IDI1, the latter being 
the enzyme used in most eucarya and bacteria for the 
biosynthesis of DMAPP. 
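
The distribution of mevalonate pathway enzymes described above can be summarized as a small lookup structure. The Python sketch below encodes the five canonical steps and the exceptions named in the text, and reports which steps have no detected enzyme for a given archaeal order. The function and variable names are hypothetical, and treating every order not listed as an exception as following the "most archaea" pattern is an assumption of this example, not a claim about any particular genome.

```python
# The five canonical steps, with enzyme abbreviations as used in Table 1.
PATHWAY = [
    ("HMGS", "acetyl-CoA + acetoacetyl-CoA -> HMG-CoA"),
    ("HMGR", "HMG-CoA -> mevalonate"),
    ("MK", "mevalonate -> phosphomevalonate"),
    ("PMK", "phosphomevalonate -> diphosphomevalonate"),
    ("PPMD", "diphosphomevalonate -> IPP"),
]

# Orthologs of the first three enzymes are reported for archaea in general.
FOUND_IN_ARCHAEA = {"HMGS", "HMGR", "MK"}

# Exceptions named in the text for the last two enzymes.
EXCEPTIONS = {
    "Halobacteriales": {"PPMD"},
    "Thermoplasmatales": {"PPMD"},
    "Sulfolobales": {"PMK", "PPMD"},
}


def undetected_steps(order: str) -> list[str]:
    """Enzymes of the canonical pathway not detected in the given order.

    Assumption: orders absent from EXCEPTIONS lack both PMK and PPMD.
    """
    present = FOUND_IN_ARCHAEA | EXCEPTIONS.get(order, set())
    return [enzyme for enzyme, _ in PATHWAY if enzyme not in present]


for order in ("Sulfolobales", "Halobacteriales", "Methanococcales"):
    print(order, undetected_steps(order))
```

An empty list means only that all five canonical enzymes have been detected; a nonempty list points to steps that are presumably carried out by enzymes not yet identified.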


Table 1. Distribution of lipid biosynthesis enzymes and core lipids in the Archaeaᵃ 

Columns: Archaeal groups; Enzymesᵇ; Core lipidsᶜ 

ᵃ +, present; −, absent or too divergent to be detected; B, bacterial type; E, eucaryal type; D, divergent from main archaeal type but origin uncertain; ?, no information available on presence or absence (uncultured archaeal group). 
ᵇ HMGS, 3-hydroxy-3-methylglutaryl coenzyme A synthase; HMGR, 3-hydroxy-3-methylglutaryl coenzyme A reductase; MK, mevalonate kinase; PMK, phosphomevalonate kinase; PPMD, diphosphomevalonate decarboxylase; 
IDI1/IDI2, isopentenyl diphosphate isomerase type 1 and 2; GGPPS, geranylgeranyl diphosphate synthase; FGPPS, farnesylgeranyl diphosphate synthase; GGGPS, geranylgeranylglyceryl phosphate synthase; DGGGPS, digeranylgeranylglyceryl 
phosphate synthase; ArOH, archaeol; G1PDH, sn-glycerol-1-phosphate dehydrogenase. 
ᶜ ArOH, archaeol; C25 chains, presence of one or two sesterterpanyl side chains; Cyc-ArOH, macrocyclic archaeol; OH-ArOH, hydroxyarchaeol; CAOH, caldarchaeol (GDGT-0); GDGT (1-8), glycerol-dialkyl-glycerol tetraether 
(1-8 pentacyclic rings); GDNT, glycerol-dialkyl-nonitol tetraether; GTGT, glycerol-trialkyl-glycerol tetraether; FU, H-shaped caldarchaeol derivative; CreOH, crenarchaeol. 

Figure 2. Basic structure of glycerol tetraether isoprenoid lipids. GDGT (glycerol-dialkyl-glycerol tetraether) can display a number 
of cyclic rings (0 to 8) in its alkyl core. In GTGT (glycerol-trialkyl-glycerol tetraether), only two of the four phytanyl side 
chains from the precursor diether lipids are linked by a C-C bond. Although only the antiparallel configuration is shown for 
the two glycerols forming the backbone of the tetraether lipids, both isomers are likely to be found in archaeal cells (43). GDNT 
(glycerol-dialkyl-nonitol tetraether) versions of most GDGTs, where one of the glycerol moieties is replaced by nonitol, are also 
found in a variety of archaea. The sugars or polar head groups usually attached to the hydroxy of the glycerol moieties in archaeal 
lipids are not shown. 

Figure 3. Pathways for the biosynthesis of archaeal glycerol ether isoprenoid lipids. Boxed “A” beside an enzyme name indicates 
that the enzyme is found in all archaea, and boxed “S” indicates that the enzyme is found in some archaea. 


Elongation of the Isoprenoid Side Chains 


The products of the mevalonate pathway (IPP 
and DMAPP) are used by archaea as building blocks 
for lipid side chains. These side chains are composed 
of 20 or 25 carbons (C20 or C25). The full length is 
reached through the sequential condensation of IPP to 
a growing allylic polyisoprenoid diphosphate (Fig. 3). 
The first molecule in the chain is the IPP isomer, 
DMAPP, to which IPP molecules are successively added 
to obtain GPP (geranyl diphosphate, 10 carbons), FPP 
(farnesyl diphosphate, 15 carbons), GGPP (geranylgeranyl 
diphosphate, 20 carbons), and FGPP (farnesylgeranyl 
diphosphate, 25 carbons). These chain-elongation 
reactions are catalyzed by isoprenyl diphosphate syn- 
thases, such as GGPP and FGPP synthases, which differ 
from one another by the allylic substrate they accept 
to start the elongation process, and the chain length of 
their product(s) (22). In archaea, GGPP synthase can 
elongate DMAPP to obtain both FPP and GGPP, the 
latter being the isoprenyl forming the side chains of 
C20-C20 diether lipids. Archaea harboring C20-C25 
or C25-C25 diether lipids require an FGPP synthase,
which can either elongate directly from DMAPP or 
from longer allylic substrates like GGPP. Among the 
Crenarchaeota, only Aeropyrum pernix is known to 
display C25 side chains, C25-C25 archaeol being the 
sole core lipid found in this species. The only Eury- 
archaeota known to have one or two C25 chains in 
their core lipids are haloalkaliphilic archaea from the 
order Halobacteriales (17). 


Synthesis of the Glycerol Phosphate Backbone 


Although the unique stereoconfiguration of ar- 
chaeal lipids has been known for decades, there have 
been questions about the precursor used to synthesize 
the glycerol phosphate backbone. The sn-glycerol-1-
phosphate (G1P) dehydrogenase enzyme from Metha-
nothermobacter thermautotrophicus was identified a
decade ago as responsible for the synthesis of the
sn-glycerol-1-phosphate phospholipid backbone from 
dihydroxyacetone phosphate (DHAP) (Fig. 3) (34). 
It was subsequently demonstrated that G1P dehydro- 
genase activity was exhibited by other archaea (36) 
and that a G1P dehydrogenase gene is present in all 
sequenced archaeal genomes (3). 


Linking Backbone and Side Chains 


The enzyme responsible for linking the first side 
chain to the glycerol phosphate backbone of C20-C20 
diether lipids was first characterized from M. therm- 
autotrophicus (46). This enzyme, termed geranylger- 
anylglyceryl phosphate (GGGP) synthase, is encoded 
in all sequenced archaeal genomes, with the exception 
of the obligate parasite/symbiont Nanoarchaeum eq- 
uitans (15). GGGP synthase strongly favors sn-glyc- 
erol-1-phosphate as a substrate, and therefore plays a 
role in defining the stereoconfiguration of archaeal 
lipids. The enzymatic activity responsible for adding 
the second C20 isoprenoid side chain in M. therm- 
autotrophicus diether lipids, digeranylgeranylglyceryl 
phosphate (DGGGP) synthase, was identified in a cell 
extract that separately contained activity for GGGP 
synthase (5). The enzymatic activity of a recombinant
DGGGP synthase from Sulfolobus acidocaldarius has
also been characterized in vitro (14).


Addition of Polar Head Groups 


The product of DGGGP synthase, sn-2,3-diger-
anylgeranylglyceryl phosphate, is the substrate of
CDP-archaeol synthase, which adds a cytidine group 
at the sn-1 position of the glycerol moiety to produce 
unsaturated CDP-archaeol (Fig. 3). As GGGP syn- 
thase, DGGGP synthase, and CDP-archaeol synthase 
all require fully unsaturated isoprenyl side chains as 
substrate (23), the saturation of double bonds must 
occur after the formation of unsaturated CDP-archaeol. 
By analogy to the fatty acid biosynthesis pathway in 
bacteria, unsaturated CDP-archaeol is thought to be 
a precursor for the addition of polar head groups or 
sugars at the sn-1 position (30). The nature of the head
groups or sugars found in archaeal lipids varies greatly 
in different taxa (20, 24). The head groups of phospho- 
lipids can be a variety of polar compounds (glycerol, 
serine, inosine, ethanolamine, myo-inositol, aminopen-
tanetetrols), and sugar moieties of glycolipids can take
many forms (glucose, mannose, galactose, gulose, N- 
acetylglucosamine, or combinations thereof) (13, 17, 
20, 25, 31, 44). None of the enzymes responsible for 
the addition of sugar moieties are known, and only one 
enzyme catalyzing the addition of a polar head group 
has been characterized (archaetidylserine synthase, re- 
sponsible for the addition of L-serine) (29). 


Modification of the Side Chains: 
Hydrogenation and Cyclization 


None of the enzymes involved in the saturation of 
isoprenoid side chains, introduction of cyclic rings in 
the alkyl core of tetraether lipids, or cyclization of di- 
ether lipids (formation of macrocyclic diether lipids or 
tetraether lipids) are currently known. However, some 
experiments have revealed information about the or- 
der in which these biosynthetic reactions might take 
place. Nemoto et al. (32) demonstrated that the di-
ether lipid precursors of Thermoplasma acidophilum
tetraether lipids contained a modified polar head group
(glycerophosphate). Addition of polar head groups 
was suggested to precede the formation of a tetraether 
molecule in S. acidocaldarius, where two molecules of 
archaetidylinositol would be precursors to the tetra-
ether lipids (48). In T. acidophilum, the sugar moieties of
glycolipids seem to be attached only after the synthesis
of the tetraether lipid core (32), and this may also be the
case in S. acidocaldarius. 

It is unclear whether saturation of side chains oc- 
curs during or after the head-to-head condensation 
reaction leading to tetraether lipids. An unusual double-
bond migration can occur along the unsaturated side
chains of lipid precursors in M. thermautotrophicus
and Methanococcus jannaschii, leading to an isomer- 
ized intermediate with a terminal double bond. Ac- 
cording to Eguchi et al. (9), this isomer could be the 
precursor for the creation of C—C bonds leading to 
macrocyclic and tetraether lipids. In this proposed 
mechanism, the formation of a C—C bond is trig- 
gered by the addition of a proton, which suggests that 
it would occur simultaneously with the saturation of 
the side chains. On the other hand, Nemoto et al. (32) 
have shown from mass spectrometry experiments of 
the lipids of T. acidophilum that the diether precursors
of the tetraether lipids are fully saturated. They sug- 
gest that saturation/unsaturation reactions could be 
reversible and that a terminal double bond could be 
reintroduced before the condensation reaction leading 
to the tetraether lipid. 

The presence of ether lipids with partially satu- 
rated side chains in the methanogens, Methanococ- 
coides burtonii and Methanopyrus kandleri, and the 
halophiles, Halobacterium salinarum and Natrono- 
bacterium magadii, adds a level of complexity by dem- 
onstrating that saturation reactions can also be par- 
tial (33, 35, 41). Moreover, the introduction of cyclic 
rings in the side chains, including head-to-head con- 
densation of diether lipids, is thought to require the 
presence of some double bonds. However, similar to 
the studies on partially saturated side chains (34, 36, 
42), there is little experimental evidence on when 
these structures are formed during the process of ar- 
chaeal lipid biosynthesis. 


EVOLUTION OF LIPID 
BIOSYNTHESIS IN ARCHAEA 


Origins of the Archaeal Mevalonate Pathway 


All organisms harboring a functional mevalonate 
pathway possess homologous enzymes catalyzing the 
first three steps: HMG-CoA synthase (HMGS), HMG- 
CoA reductase (HMGR), and mevalonate kinase 
(MVK) (Table 1). As a rule, life’s three domains (Bac-
teria, Archaea, and Eucarya) exhibit monophyly for
these enzymes (enzymes within a domain are more
similar to each other than to those of other domains). How-
ever, there are some striking and well-supported ex- 
ceptions. Some of the archaeal genes in the mevalonate 
pathway appear to have been acquired by lateral gene 
transfer (LGT). For example, all representatives of the 
Euryarchaeota from the Archaeoglobales and Ther- 
moplasmatales harbor an HMGR gene of seemingly 
bacterial origin (2). 

In the mevalonate pathway of bacteria and fungi, 
the phosphorylation of mevalonate is catalyzed by
mevalonate kinase (MVK), and the further phospho-
rylation of phosphomevalonate is catalyzed by a ho- 
mologous protein, phosphomevalonate kinase (PMK). 
In archaea, PMK is only found in the genus Sul- 
folobus and is most closely related to the PMKs from 
fungi (3). A protein with high similarity to PPMD 
from eucarya is encoded adjacent to the PMK in the 
genome sequences of S. tokodaii and S. solfataricus. In
phylogenetic analyses, this protein clusters with eu- 
caryal homologs (3). This suggests that the PMK and 
PPMD found in Sulfolobus were acquired from eu- 
carya by one of their ancestors. The presence of these 
two enzymes only in Sulfolobus signifies that this is 
the sole genus of the Archaea known to possess all 
five enzymes of the standard mevalonate pathway. 
However, Sulfolobus are not the only archaea to har- 
bor a PPMD. This enzyme is also found in the 
Halobacteriales and the Thermoplasmatales (Table 
1). Phylogenetic analysis indicates that the PPMDs in
these archaea are likely to have a different origin than 
the Sulfolobus enzymes. Indeed, these PPMD genes 
cluster with bacterial homologs and have distinct se- 
quences from the eucaryal enzymes (3). 

The isoprenoid biosynthesis enzyme isopentenyl 
diphosphate isomerase (IDI) type 1 (IDI1) is found in 
multiple genera of Halobacteriales and is possibly 
ubiquitous in this group (3). However, it is not found 
in any other archaea (Table 1). The fact that this en- 
zyme is common in Eucarya and Bacteria but appears 
to be restricted to Halobacteriales in Archaea indicates 
that it was acquired by an ancestral Halobacteriales 
from either of these domains. All archaea, including 
the Halobacteriales, harbor the analogous type 2 IDI 
(IDI2) described by Kaneda et al. (18). Type 2 IDI is 
therefore likely to have been present in the ancestors 
of archaea, with a more recent acquisition of the type 
1 IDI in the Halobacteriales. 


Variation of Isoprenoid Side-Chain Length 


The extent to which isoprenyl diphosphate syn- 
thases can synthesize isoprenoid chains of varying 
length was demonstrated by mutational studies. A sin- 
gle amino acid substitution changed S. acidocaldarius 
GGPP synthase into an enzyme that synthesized FGPP 
as its main product, with small amounts of hexaprenyl
diphosphate (a 30-carbon isoprenyl diphosphate) as a
secondary product (37). The same enzyme with two
or three amino acid substitutions was capable of con- 
verting DMAPP, FPP, or GPP to a main product of 35 
or 40 carbons in length (heptaprenyl and octaprenyl 
diphosphates) and secondary products of 65 to 120 car- 
bons in length (38). 
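
The chain lengths quoted above follow from simple arithmetic: each isoprene unit contributes five carbons. The short Python sketch below is an illustration only (the helper names `carbons` and `units_for` are hypothetical and do not come from the cited studies). It tabulates the isoprenyl diphosphates named in this section and converts the 65- and 120-carbon secondary products into numbers of isoprene units.

```python
# Each isoprene (C5) unit in an isoprenyl diphosphate contributes 5 carbons.
CARBONS_PER_UNIT = 5

# Isoprenyl diphosphates named in the text, keyed by their number of C5 units.
PRODUCTS = {
    1: "DMAPP (dimethylallyl diphosphate)",
    2: "GPP (geranyl diphosphate)",
    3: "FPP (farnesyl diphosphate)",
    4: "GGPP (geranylgeranyl diphosphate)",
    5: "FGPP (farnesylgeranyl diphosphate)",
    6: "hexaprenyl diphosphate",
    7: "heptaprenyl diphosphate",
    8: "octaprenyl diphosphate",
}


def carbons(units: int) -> int:
    """Number of carbons in a chain built from `units` isoprene units."""
    return units * CARBONS_PER_UNIT


def units_for(carbon_count: int) -> int:
    """Number of isoprene units in a chain of `carbon_count` carbons."""
    if carbon_count % CARBONS_PER_UNIT != 0:
        raise ValueError("chain length is not a multiple of 5 carbons")
    return carbon_count // CARBONS_PER_UNIT


if __name__ == "__main__":
    for n, name in PRODUCTS.items():
        print(f"C{carbons(n):<3} {name}")
    # Secondary products of 65 to 120 carbons span 13 to 24 isoprene units.
    print(units_for(65), units_for(120))
```

Read this way, the shift from GGPP to FGPP as main product corresponds to one additional condensation step, and the longest secondary products reported for the mutant enzymes correspond to about 24 units.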

Such a change in the length of synthesized products 
has occurred for the isoprenyl diphosphate synthase 
of A. pernix. This archaeon only synthesizes disesterter-
panyl (C25-C25) diether lipids. This is rare among
archaea as most only produce diphytanyl C20-C20 di- 
ether lipids (8). A. pernix does not possess a GGPP 
synthase, but instead encodes a single FGPP synthase 
(49). The FGPP synthase was probably derived from 
an ancestral archaeal GGPP synthase; phylogenetic 
analysis places the A. pernix FGPP synthase in a group 
with archaeal GGPP synthases (3). A similar change in 
isoprenyl diphosphate synthase polyisoprenoid chain 
elongation properties must have occurred for haloalka-
liphilic species of Halobacteriales, as they are the only
other archaea to harbor sesterterpanyl side chains in
their lipids. However, these archaea maintained GGPP 
synthase activity and synthesize C20-C20, C20-C25, 
and/or C25-C25 core lipids (17). It is not known 
whether these Halobacteriales have separate GGPP 
and FGPP synthases or an enzyme that displays plas- 
ticity in the chain length of its products. 


Emergence of the Specific Stereochemistry 
of Archaeal Lipids 


Three enzymes involved in the biosynthesis of 
archaeal lipids show stereospecificity. sn-Glycerol-1-
phosphate (G1P) dehydrogenase introduces stereo-
specificity into archaeal lipids by specifically synthesiz- 
ing glycerol phosphate with the sn-1 stereoconfiguration 
from dihydroxyacetone phosphate (DHAP) (34). 
Geranylgeranylglyceryl phosphate (GGGP) synthase 
strongly favors this glycerol phosphate stereoisomer 
when attaching the first isoprenoid side chain, yield- 
ing sn-3-geranylgeranylglyceryl phosphate. The at- 
tachment of the second side chain is also stereospe- 
cific, as DGGGP synthase will only recognize the sn-1 
phosphate monoether as opposed to an sn-3 phos- 
phate monoether substrate, yielding only unsaturated 
archaeol (sn-2,3-digeranylgeranylglyceryl phosphate
or DGGGP) as a product (5). CDP-archaeol synthase 
then catalyzes the condensation of a CTP molecule 
with the unsaturated archaeol released by DGGGP 
synthase to obtain unsaturated CDP-archaeol (30). 
This enzyme does not recognize the stereochemical 
structure of the glycerol phosphate backbone or the 
linkage between glycerol and the isoprenoid side 
chains (ester or ether linkage). It therefore seems that 
the specific stereoconfiguration of archaeal lipids is 
established by G1P dehydrogenase, as well as the 
GGGP and DGGGP synthases. 

G1P dehydrogenase and glycerol dehydrogenase 
catalyze similar reactions; the substrate (dihydroxy- 
acetone phosphate or dihydroxyacetone) and product 
(sn-glycerol-1-phosphate or glycerol) being phospho-
rylated for the former, but not the latter, enzyme. These
enzymes are also homologous (sharing about 20 to
25% amino acid identity) members of the NAD-de-
pendent dehydrogenase superfamily (Fig. 3). Although 
G3P and G1P dehydrogenases are functionally equiv- 
alent (with the exception of their stereospecificity), 
they share little sequence similarity. Phylogenetic 
analysis presents G1P dehydrogenases as a mono- 
phyletic cluster among the larger NAD-dependent de- 
hydrogenase superfamily (Fig. 4). This suggests that 
G1P dehydrogenase is an archaeal invention derived 
from an enzyme of the NAD-dependent dehydroge- 
nase superfamily, possibly glycerol dehydrogenase, 
which shares similar sequence, substrate, and product. 

The enzyme adding the first isoprenoid side chain 
to the glycerol backbone, GGGP synthase, is selec- 
tive for sn-glycerol-1-phosphate as an acceptor (5). 
A surprising finding from phylogenetic analysis is the 
existence of two distinct but homologous types of 
GGGP synthases. Several Halobacteriales as well as 
Archaeoglobus fulgidus possess divergent enzymes 
that cluster with homologs from various species of the 
bacterial order Bacillales (Fig. 4). Two paralogs of this 
divergent enzyme are present in a few species of Halo- 
bacteriales. The role of these two paralogs is unclear. 
It is possible that one of them encodes a farnesylger- 
anylglyceryl phosphate synthase, as some Halobacte- 
riales (including Haloterrigena, Halococcus, Natrono- 
bacterium, Natrinema, Natrialba, and Natronomonas) 
produce C20-C25 and/or C25-C25 diether lipids (17). 
However, some Halobacteriales that only have C20- 
C20 lipids (Haloferax and Halobacterium) also ex- 
hibit two paralogs. 

Due to its presence in several Halobacteriales 
genera, the divergent type of GGGP synthase was 
most likely present in the ancestors of this order (and 
possibly of the order Archaeoglobales as well) (Fig. 4). 
How this divergent type of GGGP synthase arose is 
unclear. As GGGP synthase homologs are ubiquitous 
in archaea and only found in the order Bacillales, a 
Chlorobium species and a Cytophaga species in the 
Bacteria (Fig. 4), it is likely that this enzyme is an ar- 
chaeal invention that was later acquired by bacteria 
through LGT. The Halobacteriales, Archaeoglobus, 
and Bacillales homologs are closely related to each 
other and distinct from other archaeal homologs and 
the Cytophaga/Chlorobium genes. The Bacillales 
GGGP synthase homologs would therefore have orig- 
inated from the Archaeoglobales or the Halobacteri- 
ales, while the Cytophaga/Chlorobium homolog would 
descend from the enzymes of some other archaeal 
group. Since no ether lipids of the sn-2,3-stereocon-
figuration are found in bacteria, the enzymes present
in the Bacillales and Cytophaga/Chlorobium were 
probably co-opted for a different function, possibly 
DNA replication (3, 15). Lateral transfer of GGGP 
synthase homologs is also likely to have happened 
Figure 4. Phylogenetic analysis of the stereospecific enzymes involved in archaeal isoprenoid lipid biosynthesis. (A) sn-Glycerol-1-phosphate dehydrogenase
(G1PDH) and related protein families; dehydroquinate synthase (DHQS), L-arabinose isomerase (AraM), glycerol dehydrogenase (GDH), alcohol dehydroge-
nase (ADH). (B) Geranylgeranylglyceryl phosphate synthase (GGGPS). (C) Digeranylgeranylglyceryl phosphate synthase (DGGGPS) and related protein fami-
lies; bacteriochlorophyll/chlorophyll synthase (BchG/ChlG), homogentisic acid geranylgeranyl transferase (HGGT), 1,4-dihydroxy-2-naphthoate octaprenyl-
transferase (MenA), heme biosynthesis farnesyltransferase (CyoE/COX10), ubiquinone biosynthetic polyprenyl transferase (UbiA/COQ2). The trees presented
are based on maximum likelihood amino acid distances under the minimum evolution model and were obtained using PROTDIST. Bootstrap values represent 
the consensus of 100 trees obtained from pseudo-replicates of the original dataset. Taxon names of archaea are highlighted in bold. 
among archaea, as the specific relationship found be-
tween Archaeoglobus and Halobacteriales GGGP 
synthases (Fig. 4) is not consistent with the archaeal 
phylogeny derived from proteins involved in transla- 
tion or small-subunit ribosomal proteins (28). The in- 
congruence between well-known phylogenetic mark- 
ers and GGGP synthase suggests that the latter 
enzyme was laterally transferred between the ances- 
tors of the Halobacteriales and Archaeoglobus. 

Similar to G1P dehydrogenase, DGGGP syn- 
thase does not display a history of LGT (Fig. 4). It is 
related to multiple families of prenyltransferases (e.g., 
chlorophyll biosynthesis, heme biosynthesis) and has 
probably originated from a large prenyltransferase 
superfamily. Therefore, with the exception of GGGP 
synthase, the stereospecific enzymes of archaeal lipid 
biosynthesis appear to have been derived from a ho- 
mologous enzyme with a broadly similar function in 
the same protein superfamily. These changes of func- 
tion in archaeal proteins might have been a critical 
step in the origin of this domain of life. 


FUNCTION OF ARCHAEAL LIPIDS 


Most of the microorganisms adapted to life in 
“extreme” environments (high salinity, high or low 
pH, high or low temperature, high pressure) are mem- 
bers of the Archaea. The survival ability of many 
members of the Archaea may be attributed to their 
unique cell membranes. Indeed, membranes composed 
of ether lipids have a higher stability than those made 
of ester lipids (10, 39) (see Chapters 14 and 23). In ad- 
dition to the ether linkage between backbone and side 
chains, other features of archaeal lipids have been pro- 
posed to facilitate the growth of archaea in “extreme” 
environments. For example, cyclic lipids (such as mac- 
rocyclic diether lipids or tetraether lipids) and penta- 
cyclic rings in side chains are thought to facilitate life 
at higher temperatures. The composition of the M. jan- 
naschii cell membrane is greatly altered depending on 
its growth temperature; the proportion of macrocyclic 
and tetraether lipids increases compared with acyclic 
diether lipids when the organism is grown at higher 
temperatures (47). Cyclization of the side chains de- 
creases their freedom of motion and leads to an ap- 
propriate level of membrane fluidity at higher temper- 
atures. Pentacyclic rings are similarly believed to 
reduce membrane fluidity. In response to increased 
growth temperature, Sulfolobus solfataricus increases 
the proportion of membrane lipids that contain six to 
eight pentacyclic rings, at the expense of those con- 
taining none to two rings (12). 

Members of the Crenarchaeota that live at low
temperature also display cyclic rings in the side-chains
of their lipids (7). The presence of such cyclic rings
in their membrane lipids is likely to be a consequence 
of their shared ancestry with thermophilic Crenar- 
chaeota. Given the evolutionary success of this cold- 
adapted group of archaea (i.e., they are present in 
great numbers in oceans), cyclic rings have limited ef- 
fects on growth at lower temperature and/or their im- 
pact on membrane fluidity is compensated for by other 
structural characteristics. Some of these nonther- 
mophilic marine Crenarchaeota, such as the sponge 
symbiont Cenarchaeum symbiosum, contain crenar-
chaeol in their cell membrane, a lipid that harbors a
hexacyclic ring in one of its side-chains (Fig. 2). This 
hexacyclic ring was thought to prevent dense pack- 
ing of lipid membranes, thereby increasing their flu- 
idity (6). However, crenarchaeol has recently been 
found in hot springs, where increased membrane flu- 
idity would have a negative impact on fitness for bac- 
terial and archaeal organisms (41). Following this dis- 
covery, it was suggested that the presence of hexacyclic 
rings might instead increase membrane porosity and 
provide some biological advantage associated with 
permeability (40). 

Archaeal lipids may contain varying degrees of 
saturation. The saturation level of ether lipid side 
chains is apparently not a purely genotypically deter- 
mined trait, as it can vary with growth conditions. In 
the methanogen Methanococcoides burtonii, the pro- 
portion of unsaturated lipids from cells grown at 4°C 
is significantly higher than for cells grown at 23°C 
(33). It has been shown for the fatty acids of bacter- 
ial cell membranes that membrane fluidity is directly 
related to the degree of unsaturation (42). Higher de- 
grees of unsaturation can maintain a more fluid mem- 
brane at lower temperatures. The presence of a high 
proportion of unsaturated lipids in M. burtonii’s cell 
membrane seems to be an adaptation to growth at 
low temperature, and its ability to regulate the pro- 
portion of such lipids in its cellular envelope would 
allow it to survive fluctuations in the temperature of 
its environment. 

Factors other than growth temperature might in- 
fluence the type of lipids found in cell membranes. 
For example, it has been suggested that the amount of 
solutes found in the environment could regulate the 
type of lipids composing the cellular membrane, 
which in turn would modify membrane permeability 
(40). The fact that C25 sesterterpanyl side chains are 
only found in alkaliphiles among the Halobacteriales 
could indicate that the thicker cell membrane formed 
by these core lipids helps them tolerate a high pH en- 
vironment (17). The composition of cell membranes 
can also vary with the growth phase. For example, in 
M. jannaschii, the ratio of tetraether lipids to diether 
lipids increases when cells progress from logarithmic
to stationary-phase growth (47). Atmospheric pressure
was also shown to vary the proportion of different 
lipids present in the cell membrane of this archaeon 
(19). We are just beginning to unravel environmental 
factors that influence the composition of cell mem- 
branes in archaea and to link particular lipid structures 
with their function. 


PERSPECTIVE: THE NEXT FIVE YEARS 


The enzymes catalyzing most of the steps of lipid 
biosynthesis up to the addition of head groups have 
been characterized. However, for two steps of the 
mevalonate pathway (which synthesizes the isoprenoid
precursors IPP and DMAPP; Fig. 3), the archaeal 
analogs have not been identified (PMK and PPMD; 
Table 1). It should be possible to identify the genes for 
these enzymes through complementation experiments, 
using genetically engineered Escherichia coli that har- 
bors specific genes of the mevalonate pathway (4). 

The detection of CDP-archaeol synthase and ar- 
chaetidylserine synthase activities in M. thermauto- 
trophicus suggests that the biochemical steps for the 
addition of polar head groups on archaeal lipid pre-
cursors might proceed in a manner analogous to
fatty acid biosynthesis in bacteria. In bacteria, the ad- 
dition of a cytidyl group at the sn-1 position of the 
glycerol backbone is a precursor step to the addition 
of a polar head group (which replaces the cytidyl 
group). The situation might be similar in archaea, 
where CDP-archaeol synthase could perform this cy-
tidylation step, and enzymes such as archaetidylser-
ine synthase may subsequently add polar head groups. 
This potential analogy (and possible homology) be- 
tween bacterial and archaeal pathways could guide 
the discovery of enzymes involved in the addition of 
other important polar head groups (in addition to 
serine). Once these enzymes have been characterized, 
the door to the production of molecules such as ar-
chaetidylglycerol or archaetidylinositol will be open.
These have been proposed as precursors for subse- 
quent biosynthesis steps, such as the head-to-head 
condensation that produces tetraether lipids, or the 
addition of sugars leading to glycolipids (32, 48). 

Recent introduction of different combinations of
techniques has increased our capacity to purify and
structurally analyze the polar membrane lipids
of archaea. High-performance liquid chromatography
(HPLC) combined with evaporative light-scattering
detection (ELSD) allowed the determination of the
complete polar lipid composition of T. acidophilum (us-
ing gas chromatography, mass spectrometry, and NMR
to get structural data on isolated peaks) (44). HPLC
in combination with electrospray mass spectrometry
(ES-MS) was used to characterize the membrane
phospholipids and glycolipids of halophilic archaea 
and the cold-adapted methanogen M. burtonii (using 
tandem mass spectrometry for structural data) (34, 
42). An interesting finding of the latter studies was 
that lipids with partially saturated side chains were 
present in cell membranes. Determining how this oc- 
curs may provide insight into the process of side-chain 
hydrogenation. Such lipid analyses can be used to 
look at the variation of membrane lipid composition 
under different growth conditions, as was illustrated
for M. burtonii, where the proportion of
unsaturated lipids was found to vary with growth 
temperature (33). This type of analysis has so far only 
been performed on a few different archaea and under 
a limited number of environmental variables (pres- 
sure and temperature). Examination of a wider range 
of taxa and environmental factors using these meth- 
ods could help to correlate lipid structure with cell 
membrane function. 


Solute Transport


SOLUTE TRANSPORT IN ARCHAEA


Living cells are surrounded by a lipid membrane 
that forms a barrier between the cytoplasm and the 
extracellular environment. These membranes are im-
permeable to most ions and solutes, thereby enabling
cells to control the ionic and nutrient composition of
their cytoplasm. The membrane permeability barrier
is essential for maintaining optimal conditions for meta-
bolic and energy-transducing reactions. Nevertheless, 
specific ions and solutes are necessary for cellular 
processes, and these molecules need to be transported 
into the cell. On the other hand, waste products need 
to be removed from the cell. Both the uptake and ex- 
cretion processes involve specialized integral mem- 
brane proteins, i.e., transport proteins. Membrane 
proteins are embedded in the lipid layer of the cyto- 
plasmic membrane. Typically, the lipid molecules form 
a bilayer, but some hyperthermophilic archaea, such 
as Sulfolobus, contain bipolar membrane-spanning 
lipids that form a lipid monolayer with the same thick- 
ness as a lipid bilayer (see Chapter 15). The fluidity 
and permeability properties of the membranes are 
mainly determined by their lipid composition. An op- 
timal fluidity of the membrane is required for the bar- 
rier function of the membrane, and also for the ac- 
tivity of the membrane proteins. Likewise, the proton 
permeability of the membrane needs to be controlled 
to permit efficient energy transduction (94). Organ- 
isms can respond to changes in the environment, such 
as temperature, salinity, or pH, and adjust the compo- 
sition of the cytoplasmic membrane by alteration of 
the lipid composition (94). Within limits, adjustment 
of the lipid composition allows homeostasis of mem- 
brane fluidity, proton permeability, and the phasic 
properties of the lipids. 


THE CYTOPLASMIC MEMBRANE
AND BIOENERGETICS 


Cells can generate metabolic energy by distinct 
mechanisms. One is substrate-level phosphorylation, 
whereby chemical energy released in catabolic pro- 
cesses is stored in ADP or ATP. The second mechanism 
utilizes energy-transducing systems located in the cy- 
toplasmic membrane that generate ATP or an electro- 
chemical ion gradient, which in turn is used to produce 
ATP. Electron transfer systems or membrane-bound 
ATPases are examples of such systems. They translo- 
cate protons or sodium ions across the cytoplasmic 
membrane into the medium and thereby create an elec- 
trochemical ion gradient across the membrane. When 
protons are extruded, these mechanisms yield an elec- 
trochemical gradient of protons that is composed of a 
transmembrane pH gradient (ΔpH) that is “inside al-
kaline versus acidic outside,” and a transmembrane
electrical potential (Δψ) that is “inside negative ver-
sus positive outside.” Both the ΔpH and Δψ exert an
inward-directed force on the protons, the proton mo- 
tive force (PMF). The PMF can be expressed in milli- 
volts according to the following formula: 


PMF = Δψ − 2.303 (RT/F) ΔpH


where R is the gas constant, T is the absolute temper- 
ature (K), and F is the Faraday constant. The effect of 
1 pH unit difference between cytoplasm and external 
medium is 59 mV at 25°C, and 70 mV at 80°C. In 
most organisms, the PMF has a negative value and 
the driving force of the protons is directed into the 
cell. However, in many organisms (e.g., halophiles), 
sodium ions are used as coupling ions instead of pro- 
tons. By analogy to the PMF, extrusion of sodium 


Department of Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute and Materials Science Center Plus, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.




ions from the cytoplasm into the medium results in 
the formation of an electrochemical sodium gradient 
that exerts a sodium motive force (SMF). The latter 
is composed of a Δψ and the transmembrane chemi-
cal gradient of the sodium ions:


SMF = Δψ + 2.303 (RT/F) log([Na+]in/[Na+]out)


The PMF and SMF can be used for energy-requiring 
processes such as ATP synthesis, substrate transport 
across the membrane, flagellar rotation, and mainte- 
nance of the intracellular pH. Evidently, these pro- 
cesses can take place only if the electrochemical gra- 
dients of protons and sodium ions remain intact. This 
is only possible if the cytoplasmic membrane has a 
limited permeability for these ions. 
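
The two expressions above can be evaluated numerically. The following Python sketch is a minimal illustration, not part of the original chapter: the function names are hypothetical, ΔpH is assumed to mean pH(inside) minus pH(outside) so that an alkaline interior makes the PMF more negative, and the example membrane potential and sodium concentrations are invented values chosen only to show the arithmetic. The first two printed values reproduce the 59 mV and 70 mV per pH unit quoted in the text.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_factor_mv(temp_c: float) -> float:
    """Return 2.303 RT/F in millivolts for a temperature in degrees Celsius."""
    return 1000.0 * math.log(10.0) * R * (temp_c + 273.15) / F


def pmf_mv(delta_psi_mv: float, delta_ph: float, temp_c: float) -> float:
    """PMF = delta_psi - 2.303 (RT/F) delta_pH, in mV.

    Assumption: delta_ph = pH_in - pH_out, and delta_psi is negative inside.
    """
    return delta_psi_mv - nernst_factor_mv(temp_c) * delta_ph


def smf_mv(delta_psi_mv: float, na_in: float, na_out: float, temp_c: float) -> float:
    """SMF = delta_psi + 2.303 (RT/F) log10([Na+]in / [Na+]out), in mV."""
    return delta_psi_mv + nernst_factor_mv(temp_c) * math.log10(na_in / na_out)


if __name__ == "__main__":
    print(round(nernst_factor_mv(25.0)))  # 59 mV per pH unit at 25 degrees C
    print(round(nernst_factor_mv(80.0)))  # 70 mV per pH unit at 80 degrees C
    # Hypothetical cell: delta_psi = -120 mV, interior 1 pH unit more alkaline.
    print(pmf_mv(-120.0, 1.0, 25.0))           # roughly -179 mV
    # Hypothetical cell with a tenfold lower internal sodium concentration.
    print(smf_mv(-120.0, 10.0, 100.0, 25.0))   # roughly -179 mV
```

With these conventions a tenfold sodium gradient contributes the same driving force as one pH unit, which is why the two motive forces can be treated in parallel.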

Extreme acidophiles such as Picrophilus and Sulfolobus
maintain an intracellular pH that is close to neutrality.
As a consequence, these organisms experience a very high
ΔpH across their cell membrane. This ΔpH can be up to 4
pH units, as is the case for Picrophilus (82, 93). Such a
high ΔpH can only be maintained with a membrane that has a
very low proton permeability. On the other hand, the very
high ΔpH in these archaea is partially compensated by an
inverted Δψ (negative outside versus positive inside) to
keep the PMF within viable values (63). The inversion of
the Δψ is mainly realized by the uptake of potassium ions.
Alkalophiles also maintain their intracellular pH close to
neutrality. Thus, in these organisms, the ΔpH is reversed,
i.e., alkaline outside versus acidic inside. To keep the
PMF at viable values, these organisms maintain a large Δψ
(positive outside versus negative inside) across their
membrane.
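
The compensation described for acidophiles can be made concrete with a small calculation. The sketch below is a hedged illustration, not a measured result: the target PMF of -150 mV and the growth temperature of 60°C are assumptions chosen for the example, the function name is hypothetical, and ΔpH is taken as pH(inside) minus pH(outside).

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def delta_psi_needed_mv(target_pmf_mv, delta_ph, temp_c):
    """Solve PMF = delta_psi - 2.303 (RT/F) delta_pH for delta_psi (mV)."""
    z = 1000.0 * math.log(10) * R * (temp_c + 273.15) / F
    return target_pmf_mv + z * delta_ph


# Assumed inputs: delta_pH of 4 units, 60 deg C, target PMF of -150 mV.
print(round(delta_psi_needed_mv(-150.0, 4.0, 60.0), 1))
```

Under these assumptions, the ΔpH term alone contributes roughly -264 mV, so the Δψ must be positive inside (on the order of +114 mV) for the PMF to stay near the assumed target. This is the inverted Δψ the text describes.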


LIPIDS IN ARCHAEAL 
MEMBRANES AND ADAPTATIONS 
TO ENVIRONMENTAL CHANGES 


In contrast to eucaryal and bacterial ester lipids, 
in archaea membrane lipids consist of two phytanyl 
chains that are linked via an ether bond to a glycerol 
or other alcohols like nonitol (see Chapter 15). The 
structure of the archaeal lipids has been reviewed ex- 
tensively (18, 41, 100). The archaeal lipid chain con- 
tains isoprenoid units where every fourth carbon atom 
is linked to a methyl group. Most of the archaeal phy- 
tanyl chains are fully saturated isoprenoids (18, 40, 
48, 100). Halobacteria and most mesophilic archaea
contain lipids with a C20 diether lipid core, which
form a bilayer membrane similar to ester lipids (40,
42, 91). In extreme thermophiles and acidophiles,
monolayer-forming tetraether lipids are found, which
consist of C40 isoprenoid acyl chains (18).

Archaea respond to changes of the environmen- 
tal temperature by the adaptation of the lipid com- 
position of the cytoplasmic membrane, and as dis- 
cussed above, these changes are necessary to keep the 
membrane in a liquid crystalline state and to limit its 
proton permeability. In Thermoplasma and Sulfolobus 
solfataricus lipids the degree of cyclization of the C40
isoprenoid in the tetraether lipid increases with the 
growth temperature (17, 23, 55). In the euryarchaeote 
Methanococcus jannaschii, an increase in temperature 
induces a change from diether lipids to tetraether lipids, 
which are more thermostable (86). Cyclization of the 
C40 isoprenoid chains and the use of tetraether lipids
result in a tighter packing of the lipid acyl chains and
a restricted mobility of the lipids in the membrane. 
This prevents an intolerable increase of membrane flu- 
idity and enables these archaea to tolerate elevated 
growth temperatures. The Antarctic archaeon
Methanococcoides burtonii employs unsaturation of ether
lipids to adapt to low growth temperatures. The fraction
of unsaturated lipids in the membrane was significantly
higher at 4°C than at 23°C (68). The unsaturation appears
to be caused by an incomplete reduction of the
lipid precursor instead of by a desaturase mechanism 
(68). In marine crenarchaeota a novel tetraether lipid 
(crenarchaeol) was found. The characteristic glycerol 
dibiphytanyl glycerol tetraether (GDGT) membrane 
lipid contains one cyclohexane and four cyclopentane 
rings formed by internal cyclization of the biphytanyl
chains. Its structure is similar to that of GDGTs bio- 
synthesized by (hyper)thermophilic crenarchaeota apart 
from the cyclohexane ring (15). The degree of cycliza- 
tion of the core of the lipid increased within a temper- 
ature range from 0 to 30°C (84). 


GENERAL PRINCIPLES OF TRANSPORT: 
MECHANISMS AND CLASSES 


Solute transport systems can be classified into
different groups according to their molecular
architecture and their energy requirement (Fig. 1).
Five classes of transport systems have been identified:
(i) channels, including the well-known mechanosensitive
channels, which can exhibit either large or small
conductance; (ii) secondary transporters, which use
the electrochemical gradient of protons, sodium ions,
or substrates to drive transport of substrates across
the membrane. Secondary transporters are subdivided
into three classes: (1) uniporters, which transport
solutes without a coupling ion; (2) symporters, which
translocate solutes together with a coupling ion, such
as a proton or sodium ion; and (3) antiporters, which
transport two substrates/ions in opposite directions.
Secondary transporters usually consist of one
polypeptide chain containing 4 to 14 transmembrane
segments; (iii) binding-protein-dependent secondary
transporters (TRAP transporters), which consist of a
periplasmic binding protein and a membrane protein.
These systems use the PMF or SMF to drive uptake of
solutes; (iv) group translocation systems, i.e., the
phosphoenolpyruvate (PEP)-dependent phosphotransferase
systems (PTS), which couple transport of sugars to
phosphorylation; and finally (v) primary transporters,
which use light or chemical energy such as ATP or other
compounds to drive substrate translocation. Well-studied
examples are ion-translocating respiratory chains, the
light-driven proton pump (bacteriorhodopsin), various
types of ion-translocating ATPases, and the ABC
(ATP-binding cassette) transporters. ABC transporters
have a typical modular domain structure, which usually
comprises two integral membrane proteins that form the
permease domain and two cytoplasm-located ATPases that
drive the transport of the substrate by the hydrolysis
of ATP. In eucarya, ABC transporters usually consist of
a single polypeptide that encompasses all four domains.
In bacteria and archaea, in contrast, the membrane and
ATPase domains may exist as homo- or heterodimers, or
even as polypeptides in which the two membrane domains,
or the two ATPase domains, are fused. The ABC export
systems are often homo- or heterodimers of a membrane
domain fused with an ATPase domain. Bacterial and
archaeal ABC uptake systems comprise a fifth subunit, an
extracellular substrate-binding protein that binds the
substrate at the outside of the cell and delivers it to
the permease.

Figure 1. Classes of transporters. (A) Channels and different modes of secondary transporters. (B) ABC transporters and the
PTS system. Integral membrane subunits of transporters (filled); cytoplasmic or extracellular subunits (open oval); transported
substrate (open circle). The names of the subunits of the PTS system refer to the mannitol transporter of E. coli.
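
The classification above can be summarized as a small data structure. The following Python sketch is only an organizational aid under stated assumptions: the enum, the function names, and the wording of the energy sources are hypothetical labels condensed from the description of the five classes, not an established nomenclature or database schema.

```python
from enum import Enum


class TransportClass(Enum):
    """The five classes of transport systems named in the text."""
    CHANNEL = "channel"
    SECONDARY = "secondary transporter"
    TRAP = "binding-protein-dependent secondary transporter (TRAP)"
    PTS = "group translocation system (PTS)"
    PRIMARY = "primary transporter"


# Energy source per class, paraphrased from the text (channels are passive).
ENERGY_SOURCE = {
    TransportClass.CHANNEL: "none; flux follows the existing gradient",
    TransportClass.SECONDARY: "electrochemical gradient of H+, Na+, or substrate",
    TransportClass.TRAP: "PMF or SMF",
    TransportClass.PTS: "phosphoenolpyruvate",
    TransportClass.PRIMARY: "light or chemical energy such as ATP",
}


def secondary_mode(n_transported, same_direction=True):
    """Name the mode of a secondary transporter from what it moves per cycle."""
    if n_transported == 1:
        return "uniporter"
    if n_transported == 2:
        return "symporter" if same_direction else "antiporter"
    raise ValueError("this sketch handles one or two transported species only")


print(secondary_mode(2, same_direction=False))  # antiporter
```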


DISTRIBUTION OF TRANSPORT
SYSTEMS IN ARCHAEA 


Archaeal membrane proteins have only rarely 
been studied. The exception is bacteriorhodopsin, 
which has been studied in great detail by various bio- 
chemical, structural, and biophysical methods. This 
light-induced proton pump was first isolated in the 
early 1970s from membranes of Halobacterium halo- 
bium and H. cutirubrum (53, 70). The protein exists 
in a semicrystalline state in the purple membrane; 
membranes are purple due to the color of bacterio- 
rhodopsin. The ease of its purification from purple 
membranes has made it a very suitable membrane system
to study. Bacteriorhodopsin contains a nonprotein
cofactor, retinal, which absorbs light and is
mechanistically involved in proton pumping. The
structural and molecular mechanism of light-induced 
proton pumping has been elucidated and discussed in 
many excellent reviews (21, 32, 66). 

During recent years, the genomic sequences of 
a large number of Archaea have been determined. 
The transport proteins were identified in 18 archaeal 
genome sequences (72) and described in a relational 
database (78) (see TransportDB at
http://www.membranetransport.org/). All classes of transporters can
be found in archaea except for PTS systems which are 
completely absent. Strikingly, PTS systems are also 
absent in the hyperthermophilic bacteria Thermotoga 
maritima and Aquifex aeolicus that deeply branch in 
the universal phylogenetic tree, indicating that PTS- 
systems may have arisen relatively late in evolution. 

The number of transporters present in the differ- 
ent archaeal genomes can be expressed as the number 
of transporters per Mb of genome. The value for this 
parameter varies from 21.3 in Methanopyrus kandleri 
to 73 in Picrophilus torridus and Thermoplasma vol- 
canicum, and in most archaea it is about 40. Because 
there are no PTS systems present in archaea, only four 
classes of transporters exist, i.e., ion channels, ATP- 
dependent transporters and secondary transporters, 
and TRAP transporters, which so far have been iden- 
tified in only three archaeal species. The ion channels 
constitute the smallest class and amount to only 5 to 
8% of all transporters. Only M. kandleri and Nano- 
archaeum equitans are exceptions, with 14% and 
19% ion channels, respectively. Whereas N. equitans 
contains two small conductance mechanosensitive ion 
channels, M. kandleri encodes three candidates of 
voltage-gated ion channels. On average, about 50% 
of the transporters belong to the class of secondary 
transporters, while the remainder (~37%) are ATP-dependent
transport systems. The ATP-dependent transport systems are
the predominant class only in Aeropyrum pernix,
N. equitans, and Pyrobaculum
aerophilum. In A. pernix and P. aerophilum this is 
mainly caused by a high number of ATP-binding cas- 
sette (ABC) transporters, whereas in N. equitans 
(which has a small genome), 5 of the 8 genes belong to 
an F-type ATPase. In this organism, only three ABC 
transport genes are found, of which only one encodes 
a protein with a predicted membrane domain. This 
suggests that the other two, which have ATP-binding 
domains but not membrane domains, have other, 
maybe cytoplasmic, functions. 
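
The two measures used in this comparison, transporters per Mb of genome and the percentage of transporters in each class, are simple ratios. The sketch below shows how they could be computed; the function names are hypothetical, and the counts and genome size in the usage lines are invented placeholders rather than values for any real genome.

```python
def transporters_per_mb(n_transporters, genome_size_bp):
    """Number of transporters per megabase of genome sequence."""
    return n_transporters / (genome_size_bp / 1_000_000)


def class_percentages(counts):
    """Convert per-class transporter counts into percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Placeholder numbers for illustration only (not from any sequenced genome).
example_counts = {"ion channels": 5, "secondary": 40, "ATP-dependent": 30, "TRAP": 5}
print(transporters_per_mb(sum(example_counts.values()), 2_000_000))  # 40.0
print(class_percentages(example_counts)["secondary"])                # 50.0
```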

In the group of acidophiles (P. torridus, S. solfataricus,
Sulfolobus tokodaii, Thermoplasma acidophilum, and
T. volcanicum), the fraction of secondary
transporters (~68%) is very large. The genomes of 
these organisms encode a multitude of members of 
the amino acid-polyamine-organocation (APC) fam- 
ily of secondary transporters. These systems mostly 
function as solute:cation symporters or solute:solute
antiporters (77). Moreover, the analysis of all archaeal
genome sequences indicates that secondary transporters
of the NiCoT (Ni²⁺/Co²⁺) and Nramp (metal-ion
transporter) families are found only in acidophiles.
These systems are often involved in the uptake of heavy
metals or iron. Due to the selective
pressures that metal-rich acidic environments pose, 
acidophiles need to be resistant to heavy metals and 
some of these transporters might be involved in the 
resistance mechanisms. 


Channels 


Mechanosensitive (MS) channels have been ex- 
tensively studied in eucarya and bacteria (30). In 
eucarya, MS channels play an important role in or- 
gan functions, such as hearing, touching, and in cell 
swelling. In bacteria, MS channels mainly play a role 
in the survival of osmotic stresses experienced under 
hypotonic conditions. The physiological function of 
MS channels in archaea has not been studied, but it 
is likely to be similar to bacteria. Interestingly, the 
growth of an Escherichia coli strain that lacks the en- 
dogenous MS channels can be partially restored by 
expression of the MS channel of M. jannaschii. How- 
ever, this can only occur in a medium with high os- 
molarity (45). MS channels were first described in 
Haloferax volcanii (56), and systems from T. acidophilum,
T. volcanicum, and Methanocaldococcus jannaschii were
cloned and characterized (44-46). These channels share
many mechanistic features, such as gating by osmotic
stress, a large conductance and low ion selectivity, and
weak voltage dependence. The two MS channels from
M. jannaschii, MscMJ and MscMJLR, differ significantly in
conductance (0.3 nS and 2.0 nS, respectively) (46), but
they appear highly selective for cations. In this respect,
these channels functionally resemble the Ca²⁺-permeable
channels of skeletal and heart muscle (60).
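
To give a sense of scale for the two conductances, the single-channel current can be estimated from Ohm's law. The sketch below assumes an ideal ohmic channel and a membrane potential of 50 mV; both are assumptions made for illustration, and the function name is hypothetical.

```python
def single_channel_current_pa(conductance_ns, voltage_mv):
    """Ohmic estimate I = g * V; nanosiemens times millivolts gives picoamperes."""
    return conductance_ns * voltage_mv


# Conductances from the text; the 50 mV driving force is an assumed value.
for name, g_ns in (("MscMJ", 0.3), ("MscMJLR", 2.0)):
    print(name, single_channel_current_pa(g_ns, 50.0), "pA")
```

Under these assumptions the currents are about 15 pA and 100 pA, so the roughly sevenfold difference in conductance carries over directly to the current through an open channel.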

Two classes of ion channels can be distinguished: those
opened by ligand binding (e.g., neurotransmitters) and
those opened by the transmembrane potential
(voltage-gated channels). Structural information for both
classes of channels has been gathered from archaeal
counterparts, i.e., the calcium-gated K⁺ channel, MthK, of
Methanothermobacter thermautotrophicus (36) and the
voltage-dependent K⁺ channel, KvAP, of A. pernix (37).

MthK is a tetramer, and each subunit contains two
integral membrane domains and one C-terminal RCK
(regulator of K⁺ conductance) domain, which is located in
the cytoplasm. The RCK domains form a gating ring. When
Ca²⁺ binds, the pore of the channel opens to allow K⁺ to
be transported (Fig. 2). The voltage-dependent K⁺ channel
from A. pernix, KvAP, was first shown to be a functional
and structural homolog of eucaryal voltage-gated K⁺
channels (79). Subsequent structural studies revealed the
presence of “voltage-sensor paddles” (37, 38), which may
carry the charge across the membrane along the interface
of the transporter and the lipid membrane to open the
channel. However, this model (37, 38) is being debated,
as it opposes the conventional model in which the charge
is transported along a helix in the core of the protein.
Recent data suggest that the movement of the sensor
paddles might be very small (11, 78) and insufficient to
carry ions across the membrane.




Secondary Transporters 


Although secondary transporters are the most 
abundant class in archaeal genomes, few studies have 
been performed on members of this class. The earliest 
report on secondary transporters in archaea is from 
1977 and describes the characterization of the uptake 
systems for amino acids in membrane vesicles of 
H. halobium in response to a light-induced Δψ and an
artificially imposed sodium-ion gradient (59). This 
study reports the kinetics of all amino acid trans- 
porters, except for cysteine and aspartate. The aspar- 
tate transporter was purified from membranes of 
H. halobium and reconstituted in proteoliposomes 
(26). This system utilizes the sodium-ion gradient to 
drive transport and is highly specific for L-aspartate 
(26). A sodium-dependent glucose transporter has 
been described in H. volcanii (89). Expression of this 
system was shown to be induced by growth of the cells 
in the presence of glucose. Transport of glucose was 
blocked by inhibitors of mammalian glucose trans- 
porters (89), suggesting common structural and func- 
tional features of both systems. 

A lactose transporter was identified in S. solfa- 
taricus by functionally complementing a mutant 
strain that was unable to grow on lactose (9).
Complementation required both the lacS gene
(β-galactosidase) and a gene located upstream of lacS that
encodes a secondary transporter (73). The lactose 
transporter belongs to the major facilitator superfamily
(MFS) and is predicted to contain 12 transmembrane
segments.


Figure 2. Channel structures. The calcium-gated K⁺ channel, MthK, of M. thermautotrophicus, and the secondary transport
structure of the glutamate transporter, GltPh, of P. horikoshii, are shown. The orientation of both transporters in the
membrane is depicted. The membrane-inserted part of MthK is relatively small in comparison with the large complex of
cytoplasmically located RCK domains. The structure of GltPh shows the large cavity directed toward the extracellular side
where the substrate is bound.

Methanosarcina thermophila TM-1 and
Methanohalophilus portucalensis strain FDF1 were both
shown to actively accumulate compatible solutes such 
as glycine-betaine. The glycine-betaine transporter of 
M. thermophila is highly specific for glycine-betaine 
and transports this molecule with a high affinity (Km
of 10 µM) (74). Studies with inhibitors suggest that
transport depends on a PMF or SMF. However, since
the gene(s) have not yet been identified, the exact 
mechanism remains to be determined. In M. portu- 
calensis, growth on trimethylamine in the presence 
of 2.1 M NaCl was dramatically stimulated by the 
presence of glycine-betaine (54). This transporter is 
also specific for glycine-betaine and transports the 
molecule with a Km of 23 µM (54).
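
The practical meaning of these affinity constants can be illustrated with a saturation curve. The sketch below assumes that uptake follows simple Michaelis-Menten kinetics and that the quoted constants (10 µM and 23 µM) are half-saturation constants; both points are assumptions, since the text does not describe the kinetic model, and the function name is hypothetical.

```python
def relative_uptake_rate(substrate_um, km_um):
    """Fraction of the maximal uptake rate, v/Vmax = [S] / (Km + [S])."""
    return substrate_um / (km_um + substrate_um)


# Half-saturation constants from the text; concentrations are example inputs.
for organism, km_um in (("M. thermophila", 10.0), ("M. portucalensis", 23.0)):
    rates = [round(relative_uptake_rate(s, km_um), 2) for s in (1.0, 10.0, 100.0)]
    print(organism, rates)
```

On this model, each transporter runs at half its maximal rate when the external glycine-betaine concentration equals its constant, and the system with the lower constant stays closer to saturation at any given concentration.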

Recently, the structure of a putative glutamate 
transporter from Pyrococcus horikoshii was eluci- 
dated (101). This is the first structural report of a sec- 
ondary transporter from a member of the Archaea. 
This protein is homologous to high-affinity neuro- 
transmitter transporters of the synaptic cleft in mam- 
mals. Although active transport of glutamate could 
not be demonstrated, the structure suggests a trans- 
port mechanism for this trimeric system that involves 
a large hydrophilic pore exposed to the outside that 
easily accommodates the substrate (Fig. 2) (101). 


ABC Transporters 


ABC transporters are a major class of trans- 
porters in archaea, and several systems have been 
studied in great detail. Similar to their bacterial
counterparts, the subunits of these transporters exhibit
the typical consensus motifs that classify them as ABC
transporters. The ATPase subunits contain the typical
Walker sequences, the Q-loop and the H-region. Ar- 
chaeal ABC transporters can be divided into at least 
two classes, the binding-protein-dependent uptake systems
and the multidrug/antibiotic transporters, which probably
function as exporters. The latter class is typically
composed of two subunits, each with one integral membrane
part and a cytoplasmic nucleotide-binding domain. In some
cases, these two domains of multidrug transporters are
fused into a single polypeptide chain. In addition to the
two permease and cytoplasmic ATPase domains, which form
either homo- or heterodimers, the binding-protein-dependent
ABC transporters contain a substrate-binding protein.
The membrane domains contain the EAAAx3Gx9IxLP 
motif, which has been shown to be the site of inter- 
action between the membrane domain and the cyto- 
plasmic ATPase subunit (16). Table 1 lists the ABC 
transporters in archaea that have been studied in 
some detail. ABC sugar transporters were identified 
and characterized in Thermococcus litoralis, Pyrococ- 
cus furiosus, and S. solfataricus. Transport of osmo- 
protectants has been studied primarily in methanogens 
(Table 1). 

Bacterial sugar ABC uptake systems are divided 
into two main classes: the carbohydrate uptake trans- 
porters (the CUT class) and the di/oligopeptide trans- 
port class (83). The archaeal systems involved in 
the uptake of mono- or disaccharides belong to the 
CUT class. However, ABC transporters that transport 
di- and oligosaccharides, such as the
cellobiose/β-glucoside transport system of P. furiosus (50) and
the maltose/maltodextrin transporter of S. solfataricus 
(22), are homologous to the di/oligopeptide class. The 


Table 1. Characterized ABC transporters in the Archaea

ABC transporter       Substrate                         Km for uptake (nM)   Kd for solute binding(a) (nM)   Reference

T. litoralis          Maltose/trehalose                 22/17                160                             34

S. solfataricus       Glucose                           2,000                480                             2
                      Cellobiose + cellooligomers       -(b)                 -                               22
                      Trehalose                         -                    -                               22
                      Maltose/maltotriose               -                    -                               22
                      Arabinose                         -                    130                             22

P. furiosus           Cellobiose + cellooligomers       175                  45                              50
                      Maltose/trehalose                                                                      51
                      Maltotriose, maltodextrin         -                    270                             24, 51

H. volcanii           Glucose (anaerobic)               -                    -                               98
                      Molybdate                         -                    -                               98
                      Inorganic anions                  -                    -                               98

A. fulgidus           Glycine betaine, proline betaine  -                    -                               33

M. mazei Gö1          Glycine betaine                   -                    -                               18

M. portucalensis      Glycine betaine                   23,000               -                               54

M. thermophila TM-1   Glycine betaine                   10,000               -                               70

(a) Solute binding to binding protein.
(b) -, not determined.

similarity of the two systems extends from the
substrate-binding proteins, which show homology at the
primary sequence level to bacterial di/oligopeptide-binding
proteins, through to the subunit composition of the
transporter, which consists of two permeases and two
cytoplasmic ATPases. By analogy to bacterial
di/oligopeptide-binding proteins, the homologous
archaeal sugar-binding proteins recognize a large range 
of oligosaccharides (50). Operons encoding subunits 
of putative di/oligopeptide transporters are highly 
abundant in hyperthermophilic archaea and bacte- 
ria. Strikingly, these operons are positioned in the 
vicinity of genes that encode sugar-degrading enzymes 
(65). In Thermotoga maritima, some of these trans- 
porters were shown to be upregulated when the cells 
are grown on specific sugar substrates (14). It is ad- 
vantageous for archaea and bacteria to accumulate 
shorter sugar oligomers in a single transport step as 
the uptake of a sugar monomer and oligomer pre- 
sumably takes similar amounts of ATP. Uptake of 
oligomers thus minimizes the overall energy require- 
ments for the uptake processes. 
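
The energetic argument can be put in numbers. The sketch below assumes, as the text does, that one transport cycle costs the same amount of ATP regardless of the length of the sugar taken up; the value of 2 ATP per cycle is a placeholder assumption, and the function name is hypothetical.

```python
def atp_per_sugar_unit(units_per_substrate, atp_per_cycle=2.0):
    """ATP spent per monomer unit when a whole oligomer enters in one cycle.

    atp_per_cycle is an assumed, length-independent cost per transport cycle.
    """
    if units_per_substrate < 1:
        raise ValueError("a substrate contains at least one sugar unit")
    return atp_per_cycle / units_per_substrate


for n_units in (1, 2, 4):
    print(n_units, atp_per_sugar_unit(n_units))
```

With a fixed cost per cycle, the ATP spent per sugar unit falls in inverse proportion to the length of the oligomer, which is the advantage the text attributes to oligosaccharide uptake.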

In H. volcanii, the function of several ABC trans- 
porters has been studied by using genetics. Comple- 
mentation studies of mutants unable to grow under 
nitrogen limitation yielded three ABC transporters 
that are involved in the transport of glucose, molyb- 
date, or inorganic ions (98). In a recent study, an ABC 
transporter for a corrinoid (a vitamin B12 precursor)
was genetically characterized in Halobacterium sp. 
strain NRC-1 (99). 

The best-studied example of an archaeal ABC 
uptake system is the trehalose/maltose transporter of 
T. litoralis. The trehalose/maltose-binding protein 
(TMBP) binds its substrate with a very high affinity 
(160 nM at 80°C) (34); transport is also a high-affinity
process. The ATPase subunit of this transporter, MalK, was
expressed in E. coli and characterized (27); the structure
was elucidated for both MalK and TMBP (19, 20); and the
entire transporter was heterologously expressed in E. coli
and purified (28). Although the
solubilized and purified transporter displayed an in- 
trinsic ATPase activity, the activity was not stimulated 
by the addition of binding protein, in contrast to stud- 
ies with bacterial systems (76). 


The substrate-binding protein 


The binding protein captures the substrate at the 
outside of the cell and delivers it to the membrane- 
embedded permease domains, whereupon it is translo- 
cated across the membrane. Substrate-binding pro- 
teins contain two large domains (lobes) that are linked 
by a flexible hinge region. Upon binding of the
substrate, the two lobes close in a “Venus flytrap”
mechanism (75). The structures have been solved for TMBP
of T. litoralis (20), the maltodextrin-binding protein 
(PfuMBP) of P. furiosus (25), and ProX, the glycine- 
betaine- and proline-betaine-binding protein of Ar- 
chaeoglobus fulgidus (81). Both TMBP and PfuMBP 
have high structural similarity to the maltose-binding 
protein of E. coli (EcMBP), even though the primary 
amino acid sequence identity is low. Structurally equiv- 
alent, but not identical, amino acids are involved in 
sugar binding in TMBP and the EcMBP (20). In all of 
these binding proteins, two elongated patches of hy- 
drophobic amino acids outline the binding pocket. 

No significant heat release or absorption occurs when
maltose binds to PfuMBP, in contrast to the endothermic
binding of maltose to EcMBP. Both proteins have the same
binding constant. On the other hand, binding of
maltotriose to PfuMBP is strongly exothermic and occurs
with an affinity (Ka ≈ 2.7 × 10⁷ M⁻¹) that is 20-fold
higher than that observed for maltotriose binding to
EcMBP (Ka ≈ 1.3 × 10⁶ M⁻¹) (25). In comparison with
EcMBP, the bound sugar seems less deeply buried in the
binding cleft of PfuMBP. The latter structure is much
less flexible, indicating that the sugar-bound form is
closer to an “open” than to an “occluded” state. It has
been suggested that this binding mode resembles a
“lock-and-key” mode more than the “Venus flytrap”
mode (25).
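
The two maltotriose binding constants can be compared on more familiar scales. The sketch below reads them as association constants, which their units of reciprocal molar imply, and converts them to dissociation constants and standard binding free energies. The temperature of 25°C and the function names are assumptions made for illustration.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def kd_nm(ka_per_molar):
    """Dissociation constant in nM from an association constant in M^-1."""
    return 1e9 / ka_per_molar


def delta_g_kj_per_mol(ka_per_molar, temp_c=25.0):
    """Standard binding free energy, dG = -RT ln(Ka), in kJ/mol."""
    return -R * (temp_c + 273.15) * math.log(ka_per_molar) / 1000.0


ka_pfumbp, ka_ecmbp = 2.7e7, 1.3e6  # maltotriose binding constants from the text
print(round(ka_pfumbp / ka_ecmbp, 1))                  # fold difference in affinity
print(round(kd_nm(ka_pfumbp)), round(kd_nm(ka_ecmbp)))  # Kd values in nM
```

The ratio of about 21 agrees with the 20-fold difference stated in the text, and the corresponding dissociation constants are roughly 37 nM and 770 nM.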

ProX from A. fulgidus binds glycine-betaine and
proline-betaine with a Kd of 60 and 50 nM, respectively,
at room temperature (33). These compounds act as
thermoprotectants in this organism. The structure of ProX
was determined in the ligand-free (apo) form and in
complex with glycine-betaine, proline-betaine, and
trimethylammonium (81). Upon binding of these compounds,
the two lobes move closer to each other. The
binding of the substrates was mediated by cation-pi 
interactions and nonclassical hydrogen bonds between 
four tyrosine residues, a main-chain carbonyl oxygen, 
and the substrates. The mechanism of binding of the 
ligands occurs in a manner that is similar to the E. coli 
homolog. However, the A. fulgidus and E. coli pro- 
teins have low sequence identity, and the residues in- 
volved in binding are structurally equivalent but not 
conserved (81). 

Structural studies on archaeal substrate-binding
proteins have been performed with proteins heterologously
expressed in E. coli. A common feature of
the substrate-binding proteins and other extracellular, 
especially membrane-bound, proteins from archaea is 
that they are extensively glycosylated in their native 
host (24, 28, 31, 88) (see Chapter 11). Binding proteins 
isolated from P. furiosus contain glucose moieties (51), 
whereas mannose, glucose, galactose, and N-acetylglu- 
cosamine have been identified in the binding protein 
of S. solfataricus (22). Due to the extensive
glycosylation, the archaeal binding proteins are readily
isolated from solubilized membrane fractions by lectin
affinity chromatography (2, 22, 28, 50). However,
glycosylation does not appear to be essential for
functionality, as the heterologously expressed,
nonglycosylated proteins bind sugar normally (20, 25, 34,
50, 51, 81). Glycosylation may affect the stability of
the proteins, protect them from proteolytic degradation,
or influence their interaction with the cell envelope.

The distinction between the CUT and di/oligo- 
peptide classes of ABC transporters is also evident for 
the protein-domain organization of the binding pro- 
teins (Fig. 3A and B). Binding proteins belonging to 
the CUT class contain an N-terminal signal peptide 
followed by a stretch of hydroxylated amino acids 
that is up to 60 residues long (the ST linker). The ST 
linker might act as a flexible region between the mem- 
brane anchor and the binding domain. ST linkers 
were also identified in a pullulanase (24) and a haloar- 
chaeal S-layer protein (88). In the latter case, this re- 
gion was the site of O-linked glycosylation. 

As archaeal cells are surrounded by an S layer 
with pores of 4 to 5 nm size (47) (see Chapter 14), 
small molecules and proteins can easily diffuse into 
the medium. Therefore, extracellular proteins need 
to be attached to the archaeal cell envelope. In gram- 
positive bacteria, sugar-binding proteins are attached 
to the membrane by a lipid anchor that is covalently 
attached to the protein after translocation across the 
membrane. In bacteria, a cysteine residue at the +1 
position is first lipidated prior to signal sequence re- 
moval by a specific lipoprotein signal peptidase (see 
Chapter 17). However, archaeal genomes do not ap- 
pear to contain a homolog of the bacterial enzyme, 
and it is currently unclear how lipidation takes place
in archaea. Nevertheless, the N termini of many
euryarchaeal binding proteins contain the same “lipobox”
motif as bacterial proteins, a cysteine residue at the +1
position that is preceded by small hydrophobic residues
(4). Analysis of the halocyanin of the archaeon
Natronobacterium pharaonis by mass spectrometry revealed
that the N-terminal cysteine residue is covalently linked
to a C20 diphytanyl diether lipid (61) (see Chapter 15).
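
A lipobox of the kind described here can be searched for with a simple pattern match. The sketch below is a rough illustration only: the regular expression is a hypothetical approximation of the bacterial lipobox consensus (small hydrophobic residues followed by the +1 cysteine), the 40-residue search window is an assumption, and the example sequence is invented rather than taken from any archaeal protein.

```python
import re

# Assumed approximation of the lipobox: [LVI][ASTVI][GAS] followed by Cys (+1).
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox_cysteine(protein_seq, n_term_window=40):
    """Return the 0-based index of a candidate +1 cysteine, or None.

    Only the first n_term_window residues are searched, because the motif
    sits at the end of the N-terminal signal peptide.
    """
    match = LIPOBOX.search(protein_seq[:n_term_window].upper())
    return None if match is None else match.end() - 1


# Invented N-terminal sequence used purely to exercise the function.
print(find_lipobox_cysteine("MKKLLAVLLVLAGCSGG"))  # 13
```

A hit from such a pattern only flags a candidate; as the text notes, how lipidation takes place in archaea is still unclear.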

Members of the CUT class of binding proteins from
S. solfataricus contain type IV pilin-like signal peptides
and do not contain secretory signal peptides (2, 5, 22).
Such signal peptides are usually found in the subunits of
bacterial pili and archaeal flagellins (90), and typically
consist of a positively charged N terminus followed by a
hydrophobic domain (see Chapter 17). The hydrophobic
domain acts as a scaffold for the assembly of the
translocated subunits into a multimeric structure.
However, this process requires the positively charged
N terminus to be removed by a type IV pilin-like signal
peptidase. Because of the presence of type IV pilin-like
signal peptides, it has been hypothesized that the binding
proteins of S. solfataricus multimerize into a structure
referred to as the “bindosome” (1). In vitro assays
demonstrated that
the signal peptides of these binding proteins are pro- 
cessed by a specialized type IV prepilin signal pepti- 
dase, PibD, and not by the typical signal peptidase I 
(6). This enzyme also processes the flagellin subunits 
before assembly into the flagellum (7). 

Figure 3. Mode of anchoring of archaeal substrate-binding proteins and their domain structure. (A) Archaeal substrate-binding
proteins are bound to the membrane either by a fatty acid modification of their N terminus or by a hydrophobic domain at the
C or N terminus. (B) Domain organization of substrate-binding proteins. N, N terminus; C, C terminus; SS, signal sequence;
ST-linker, serine/threonine-rich amino acid stretch; filled circle, substrate of binding protein.

The binding proteins of the di/oligopeptide class contain
typical bacterial signal peptides that are processed by an
archaeal homolog of signal peptidase I (8, 67). In contrast
to the CUT class binding proteins, these proteins harbor a
hydrophobic region at the C terminus that is preceded by
an ST linker. Therefore,
it appears that the CUT class binding proteins have 
a domain organization that mirrors the di/oligopep- 
tide class sugar-binding proteins. Because of the pres- 
ence of the C-terminal hydrophobic domain, which 
is likely to function as a transmembrane domain to an- 
chor these proteins to the cytoplasmic membrane, the 
di/oligopeptide class binding proteins do not need to be 
lipid modified at their N termini. However, some pyro- 
coccal binding proteins contain a GGICG sequence 
motif immediately downstream of the hydrophobic 
domain at their C terminus. As in some halophilic
S-layer proteins that contain this motif, C-terminal
lipid modification may occur (52).

Many archaea contain binding proteins of the 
CUT class (4). S. solfataricus is an exception that 
has more binding proteins from the di/oligopeptide 
class (rather than the CUT class), and methanogens 
do not contain any member of the di/oligopeptide 
class. This might imply that S. solfataricus grows on 
carbon sources derived from higher oligomeric sub- 
strates without the need of extracellular enzymatic 
“pre-digestion,” whereas methanogens do need to 
enzymatically cleave substrates before they can be 
transported into the cell. 

Halobacterium contains a class of binding pro- 
teins that function as receptors in chemotaxis rather 
than in substrate uptake (49) (see Chapter 18). Al- 
though some bacterial substrate-binding proteins are 
also involved in extracellular substrate concentration 
sensing, their main function is transport (87). In gen- 
eral, substrate-binding proteins cluster in operon-like 
structures together with the other structural genes of 
the ABC transporter. However, in Halobacterium, the 
substrate-binding proteins, BasB (branched amino
acids) and CosB (compatible solutes), are found in 
transcriptional units with their cognate transducer 
protein (49). 


The ATP-binding protein 


ATP-binding proteins of ABC transporters pro- 
vide the driving force for substrate translocation. These 
proteins typically exist as homo- or heterodimers. 
Binding of ATP stabilizes the dimeric form of the iso- 
lated subunits, whereas hydrolysis of ATP causes the 
destabilization of the dimer. In short, upon ATP bind- 
ing, the conformational change in the ATPase dimer 
affects the permease domain and results in a confor- 
mational change of the membrane domains and the 
passage of the substrate. ABC ATPases are highly con- 
served proteins in all three domains of life and share 
several motifs, such as the Walker A/B motifs (98) and 
the ABC signature motif, LSGGQ (10). The residues of
the Walker motifs have been shown to be important
for binding of ATP and hydrolysis, whereas the ABC
signature motif is essential for the dimerization process
following ATP binding (64).
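
The conserved motifs named above lend themselves to a simple sequence scan. The following is a minimal sketch, not a validated annotation tool. It assumes the commonly cited Walker A consensus G-x(4)-G-K-[S/T], which is not spelled out in this chapter, and the LSGGQ signature given in the text. The Walker B motif is omitted because its consensus is too degenerate for a naive pattern. The function name and the toy sequence are hypothetical.

```python
import re

# Assumed consensus patterns (illustrative only):
#   Walker A (P-loop): G-x(4)-G-K-[S/T]
#   ABC signature motif: LSGGQ (as given in the text)
WALKER_A = re.compile(r"G.{4}GK[ST]")
SIGNATURE = re.compile(r"LSGGQ")


def find_motifs(seq):
    """Return 1-based start positions of each motif in a protein sequence."""
    seq = seq.upper()
    return {
        "walker_a": [m.start() + 1 for m in WALKER_A.finditer(seq)],
        "signature": [m.start() + 1 for m in SIGNATURE.finditer(seq)],
    }


# Hypothetical toy sequence, not a real ABC ATPase.
toy = "MKAIEVKNLSFGPSGSGKSTLLRAAELSGGQRQRVAIARAL"
print(find_motifs(toy))
```

Real ABC ATPases often carry variants of the signature motif, so a strict match such as this one would miss some genuine family members.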

Many archaeal ATP-binding proteins have been 
crystallized, such as LolD (MJ0769) and LivG (MJ1267)
from M. jannaschii (102), MalK from the trehalose/ 
maltose transporter of T. litoralis (19), and GlcV, the 
ATPase subunit of the glucose transporter of S. solfa- 
taricus (95). Studies on LivG showed that the ABC 
signature motif is essential for dimer formation. Dimer- 
ization results in the completion of the catalytic ATP- 
binding site, as both monomers contribute residues 
for the active site (64). Although the ABC ATPase 
dimer contains two nucleotide-binding sites, hydroly- 
sis of ATP at only one of the sites appears sufficient for 
transport (96). The overall fold of the archaeal ATP- 
binding domains is structurally the same as that of bac- 
terial and eucaryal proteins. 

MalK and GlcV contain an additional domain
at their C terminus (Fig. 4) that predominantly
consists of β-sheets organized in an OB
(oligonucleotide/oligosaccharide) fold. This C-terminal
extension is also present in MalK, the ATPase of the
E. coli maltose ABC transporter (12). In E. coli, this
domain interacts with a positive regulator of the mal operon,
MalT (71). When MalT is bound by MalK, activation 
of the mal operon cannot occur. However, when mal- 
tose is present in the medium, ATP is hydrolyzed by 
MalK, and MalT is released into the cytosol, where it 
can activate the transcription of the mal operon. This 
is the only example of an ATPase subunit that is in- 
volved in the regulation of expression of its own 
operon. MalT homologs have not been identified in
archaea, and at this stage it is unclear whether the
C-terminal extensions of the archaeal ABC ATPases are
involved in regulation. Expression of the trehalose/maltose
transporter of T. litoralis is repressed by the
negative regulator, TrmB (57) (see Chapter 6). In P.
furiosus, TrmB regulates both the trehalose/maltose
transporter and the maltodextrin transporter (51,
58). This argues against the presence of a MalT-like
regulation system in archaea. DNA microarray studies
in P. furiosus of the 16-kb genomic region that is
nearly identical to a portion of the T. litoralis genome
showed that all the transporter genes are upregulated
during growth on maltose, whereas malK and trmB retain
high levels of expression even when cells are grown on
peptides. This implies that MalK might have a regulatory
role besides the direct regulation of the operon by TrmB,
but how this regulation is conferred remains to be
investigated.


Figure 4. Structure of the ATP-binding subunit, GlcV, of the glucose
ABC transporter of S. solfataricus. The ATP-binding domain
contains all the necessary residues for ATP hydrolysis. Bound ATP is
shown in the structure (ball model). The function of the C-terminal
domain is unknown.


Comparison of ABC transporters 
present in Sulfolobus species 


The genome sequences of three Sulfolobus species 
have been completed: S. tokodaii, S. solfataricus (43, 
86), and S. acidocaldarius (13). S. solfataricus contains 
28 ABC transporters, while S. tokodaii and S. acido- 
caldarius contain only 18 and 13 systems, respectively 
(Table 2). The larger number of ABC transporters in 
S. solfataricus can be mainly attributed to a subset of 
binding-protein-dependent ABC transporters. Of the 
three Sulfolobus species, S. solfataricus can utilize the 
largest number of carbon sources for growth. It can 
grow heterotrophically in a basal salt medium on a 
wide range of sugars, including glucose, cellobiose, 
maltose, arabinose, lactose, and others (22, 29). This 
diversity of carbon source utilization is reflected by 
the diversity of ABC transporters. Substrates of five 
ABC transporters in S. solfataricus have been identi- 
fied (2, 22). In S. tokodaii, only a close homolog of the 
cellobiose transporter is present, whereas no homologs 
of the arabinose, glucose, maltose, or trehalose trans- 
porters have been identified either in S. tokodaii or 
S. acidocaldarius. 

As described above (see “The substrate-binding
protein”), a group of substrate-binding proteins of
S. solfataricus contain type IV pilin-like signal peptides
(5, 22). Most of the ABC transporters whose binding
protein has a type IV pilin-like signal peptide are not
present in the other two species. However, the predicted
sugar transporter in S. solfataricus, SSO1171-SSO1168,
has homologs in the other species, and these binding
proteins all possess these unusual signal peptides.
Moreover, the homologs of the predicted di/oligopeptide
transporter SSO2619-SSO2616, Saci1038-1034
and Sto2539-2543, both contain binding proteins
with type IV pilin-like signal peptides, whereas the
equivalent transporter from S. solfataricus contains a
typical type I signal peptide. All three species contain a
homolog of the type IV pilin signal peptidase, PibD,
which has been described and characterized from
S. solfataricus. However, in S. acidocaldarius the
peptidase has not been correctly annotated. The
substrate-binding proteins with type IV pilin-like signal
peptides appear to be unique to Sulfolobales. By analogy
to other proteins that carry type IV pilin-like signal
sequences, such as the subunits of flagella and pili,
these signal peptides likely enable the binding proteins
to assemble into multimeric structures on the membrane
or S-layer surface.


MULTIDRUG TRANSPORTERS IN ARCHAEA 


A general feature of bacteria and archaea is that 
they are equipped with defense systems against toxic 
compounds from the environment. This protection 
mechanism involves multiple drug transport systems 
that extrude toxic compounds from the cell. In gen- 
eral, the drug transport systems belong to the class 
of secondary transporters or ABC transporters. A 
P-glycoprotein-like drug efflux pump was described 
in H. volcanii (62). An anthracycline-resistant mutant 
of H. volcanii showed an improved ability to extrude
a range of drugs (e.g., rhodamine 123) compared
with wild-type cells (39). As with the human multidrug
transporter, P-glycoprotein, drug transport
in the H. volcanii mutant could be reversed by the
addition of an uncoupler, FCCP, or the Ca2+-channel
antagonist, nifedipine (NDP) (62). This indicated that
the improved resistance was mediated by an ABC 
transporter (62). 

Another multidrug transporter has been charac- 
terized in H. salinarum (69). Hsmr is a small trans- 
porter that belongs to the group of small multidrug re- 
sistance proteins that are typically only 110 amino 
acids in length. These systems function as drug/proton 
antiporters. Hsmr was expressed in E. coli and was 
shown to actively extrude ethidium from cells.
Detergent-solubilized Hsmr binds the substrate
tetraphenylphosphonium (TPP+) with high affinity in a
salt-dependent manner, with KD values of 200 and 40 nM
at low and high salt, respectively. Hsmr has a highly unusual
sequence, with over 40% of the amino acids being valine
and alanine residues. These are distributed throughout 
the protein and are often concentrated in regions where 
there is little or no sequence conservation with other 
transporters. The authors suggested that this phenom- 
enon is the outcome of a natural process of alanine and 
valine preference that is caused by the high GC con- 
tent of the gene (69). 
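
The link between amino acid composition and GC content can be made concrete with a short calculation. The sketch below is illustrative only. It relies on the standard genetic code, in which all alanine codons are GCN and all valine codons are GTN, so GC-rich genes can encode both residues readily. The function names are hypothetical, and the toy peptide is not the Hsmr sequence.

```python
# Standard genetic code: every alanine codon is GCN and every valine codon is GTN.
ALA_CODONS = ["GCT", "GCC", "GCA", "GCG"]
VAL_CODONS = ["GTT", "GTC", "GTA", "GTG"]


def residue_fraction(protein, residues="AV"):
    """Fraction of positions in a protein occupied by the given residues."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(protein.count(r) for r in residues) / len(protein)


def gc_fraction(dna):
    """Fraction of G and C nucleotides in a DNA string."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return (dna.count("G") + dna.count("C")) / len(dna)


# Hypothetical toy peptide, used only to exercise the function.
toy_protein = "MAVVLAAGVA"
print(residue_fraction(toy_protein))

# GC fraction of each alanine and valine codon.
for codon in ALA_CODONS + VAL_CODONS:
    print(codon, round(gc_fraction(codon), 2))
```

Applied to a real sequence, a combined alanine and valine fraction above 0.4 would correspond to the composition reported for Hsmr.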


Table 2. Comparison of the predicted ABC-transport clusters in the three Sulfolobus species


Family(a)    S. solfataricus (SSO)                     S. acidocaldarius (Saci)                  S. tokodaii
             Binding protein  Permease(s)  ATPase(s)   Binding protein  Permease(s)  ATPase(s)   Binding protein  Permease(s)  ATPase(s)

Amino acid 2835(b,c) 2839 2841 2836 3838 - -
Antibiotic - 1080 - 1078 - - -
Antibiotic - 1936 - 1934 - 2304 - 2305 - - - 2348 - 2437 -
Di/oligopeptide 1273 1274 1275 1276 1277 - -
Di/oligopeptide 1288 1283 1284 1281 1282 1760 1763 1762 1764 1765 0706 0702 0704 0700 0701
Di/oligopeptide(d) 3053 3058 3059 3055 - - -
Di/oligopeptide(e) 2669 2668 2671 2670 2672 - 2534 2535 2532 2536 2537
Di/oligopeptide 3043 3047 3048 3045 3046 - -
Di/oligopeptide - - - 1648 1649 1647 -
Di/oligopeptide 2619 2617 2618 2615 2616 1038(f) 1037 1036 1035 1034 2539 2540 2541 2542 2543
Inorganic ion 2029 2031 - 2030 - 1811 1809 - 1810 - 1137 1135 5 1136 -
Inorganic ion 1894 1992 - 1893 - - -
Iron 0485 0486 - 0487 - 1220 1221 - 1222 - 0840 0841 - 0840 -
Multidrug - 0051 - 0053 - - - 0609 - 0608 -
Multidrug - 1318 - 1319 - - -
Multidrug - 2135 - 2137 - 1007 - 1006 - - - 1771 - 1772 -
Multidrug - 2404 - 2402 - - 0945 - 0946 - - 0536 - 0535 -
Multidrug - 2601 - 2600 - - -
Multidrug - 2646 - 2647 - - 1800 - 1799 - - 0755 - 0756 -
Multidrug - 3012 - 2123 f f - 1099
Multidrug - 3168 - 3169 - - 2303 - 2302 - - 2445 - 2446 -
Multidrug - - - 2510 - 2509 -
Nitrate - 2468 - 2469 - - 0258 - 0257 - - 0594 - 0595 -
Phosphate 0489(f) 0490 0491 0488 - - -
Sugar 1171(f) 1169 1170 1168 - 1165(f) 1164 1163 1166 - 1103(f) 1104 1105 1106 -
Sugar 2712(f) 2714 - 2713 - - 1915 1918 - 1917 -
Sugar (glucose) 2847(f) 2848 2849 2850 - - -
Sugar (arabinose) 3066 3067 3068 3069 - - -
Sugar (trehalose) 0999(f) 1000 1001 1003 - - -
Sulfate - 1032 - 1034 - - 0114 - 2033 - 2035 -


(a) As assigned by the COG database.
(b) Numbers indicate the ORF numbers as annotated in the different genome sequences (13, 43, 85).
(c) The entries along the same row indicate homologous sets of genes in the respective organisms.
(d) Maltose transporters.
(e) Cellobiose transporters.
(f) Binding proteins with type IV pilin signal peptides.
PERSPECTIVE: THE NEXT FIVE YEARS


Membrane proteins from (hyper)thermophiles are 
very amenable to structural studies, because they tend
to crystallize much more easily than their mesophilic 
counterparts (see Chapter 20). The crystallization of 
membrane proteins has caused problems for many 
years as it has been difficult to obtain sufficiently large 
amounts of purified membrane proteins. Moreover, the 
need for lipids or detergents to prevent precipitation of 
these hydrophobic proteins in the crystallization solu- 
tions causes additional problems. Expression of ar- 
chaeal membrane proteins remains a bottleneck. 
Whereas most archaeal cytoplasmic proteins are easily 
produced in bacterial hosts, archaeal membrane pro- 
teins are difficult to express heterologously. The recent 
development of expression systems for Sulfolobus (3) 
and Thermococcus (80) may provide new opportunities 
for producing large quantities of functional archaeal 
membrane proteins. To date, only five structures of
archaeal membrane proteins have been solved.
Bacteriorhodopsin was the first to be crystallized, a process
that was greatly facilitated by the presence of this protein
in a semicrystalline state in the halobacterial membrane.
Other examples of archaeal membrane protein
structures are the voltage-gated channel of M.
thermautotrophicus, MthK (36), the voltage-dependent
K+ channel from A. pernix (35), the glutamate transporter
of P. horikoshii (101), and the protein-secretion
(SecY) complex from M. jannaschii, which is involved in
the translocation of proteins across the cytoplasmic
membrane (92) (see Chapter 17).
Chapter 17


Protein Translocation into and across Archaeal 
Cytoplasmic Membranes 


MECHTHILD POHLSCHRODER AND KIERAN C. DILKS 


INTRODUCTION 


The cytoplasmic membrane is an amphipathic
structure that separates the interior of the cell from 
the external environment. However, interactions be- 
tween the cytoplasm and the extracytoplasmic envi- 
ronment are essential for cellular life. For example, 
cells must communicate signals across the membrane, 
secrete toxins, and take up nutrients. The latter pro- 
cess not only requires integration of transporters, but 
also the secretion of substrate-binding proteins across 
the membrane (see Chapter 16). Additional proteins 
that need to be secreted include, among others, poly- 
mer-degrading enzymes and subunits of extracyto- 
plasmic cellular structures, such as the cell wall, fla- 
gella, or pili (see Chapters 14 and 18). Thus, the 
ability to translocate proteins into and through the 
hydrophobic membranes that provide the semiperme- 
able barrier to the cytoplasm is required by all forms 
of life. 

A few proteins have been reported to sponta- 
neously insert into the membrane without the aid 
of protein “machineries.” However, most membrane 
proteins and substrates translocated across the mem- 
brane require such machineries to catalyze the 
translocation process. Several pathways have been
identified that are involved in the translocation of
unfolded proteins into or across the endoplasmic
reticulum (ER) membrane in eucarya and the cytoplasmic
membrane in bacteria and archaea (49, 79, 98). The
secretion of the majority of these proteins is thought
to depend on the universally conserved Sec pathway
(79). Substrates translocated via the Sec pore can be
translocated cotranslationally or posttranslationally,
that is, after much of the protein has been synthesized
and extruded from the ribosome. The mechanisms of co-
and posttranslational translocation through this pore
vary among organisms and require distinct components
for substrate recognition and protein translocation
(53, 86). Many bacteria and archaea also encode
a Sec-independent translocation machinery, the
Tat pathway, dedicated to the secretion of
cytoplasmically folded proteins (11).

While the most extensive studies of these translo- 
cation pathways have been conducted in bacteria and 
eucarya, recent analyses of the archaeal Sec and Tat 
pathways have revealed novel and crucial informa- 
tion about archaeal protein translocation, as well as 
protein translocation in general. 


THE Sec TRANSLOCATION PATHWAY 


The Sec pathway is the only known universally
conserved protein translocation pathway. To translocate
proteins into and across the cytoplasmic membrane
in bacteria and archaea and the ER membrane in eucarya,
the Sec pathway must: (i) distinguish Sec substrates
from other proteins; (ii) distinguish proteins
that are integrated into the membrane from
those translocated across the lipid bilayer; (iii) transport
these proteins across or into the bilayer; and finally,
(iv) allow for the translocation of proteins without
disrupting the integrity of the membrane.


Sec Substrates, Their N-Terminal Signal Peptides, 
and Signal Peptidases 


Integral membrane proteins 


Essential cellular processes, such as the commu- 
nication between the cytoplasm and the extracellular 
environment, uptake of nutrients, or the generation 
of ATP, require proteins that are embedded in the lipid 
bilayer. In archaea, these include well-characterized
membrane proteins, such as bacteriorhodopsin or 
subunits of voltage-gated channels. It is believed that 
most of these proteins are inserted into the lipid bi- 
layer via the Sec pathway. While little is known about 
archaeal membrane protein insertion, it is thought 
that, similar to bacteria and eucarya, targeting of 
membrane proteins to the Sec pore involves the recog- 
nition of the first hydrophobic transmembrane seg- 
ment (TM) by the cytoplasmic signal recognition par- 
ticle (SRP) (see “Protein targeting” below) as it is 
being translated (26, 35). 

An exception seems to be bacterioopsin (BO), a
seven-TM polypeptide in the Halobacterium salinarum
cytoplasmic membrane (104). BO is the apoprotein
of bacteriorhodopsin (BR), a light-driven
proton pump that has been attractive for membrane
insertion studies, as high-resolution structural analysis
has revealed its topology in the native membrane.
BO is synthesized with an N-terminal 13-amino-acid
presequence that is distinct from a TM or any of the
known Sec signal sequences (see below and Fig. 1).
Several studies are consistent with a model in which
BO is inserted into the membrane by the Sec pathway
(40). In this model, the SRP is localized to the membrane
upon induction of BO synthesis (40). However,
reports of posttranslational translocation of BO conflict
with this model and highlight the need for additional
experiments to determine its mechanism of
translocation.


Figure 1. Schematic representation of different classes of Sec and Tat signal sequences (Sec: class I, class II [lipoprotein],
class III [type IV pilin-like]; Tat: class I, class II [lipoprotein]). Grey and hatched boxes represent N-terminally
charged and hydrophobic (H region) domains, respectively. Arrows indicate the signal peptide cleavage sites. Cleavage
of predicted class 2 signal peptides of Tat substrates by SPII has not yet been confirmed experimentally. Modified from


Secreted substrates and their N-terminal signals


Many Sec substrates, such as substrate-binding
proteins, polymer-degrading enzymes, or cell wall
subunits, are secreted and are either anchored to the
membrane or released into the external environment.
These substrates possess N-terminal signal peptides
that resemble a TM, but also contain charged amino
acids at the N terminus that interact with negatively
charged phospholipids at the cytoplasmic face of the
membrane, thus orienting the N terminus of the signal
sequence into the cytoplasm as the substrate is being
translocated (108) (Fig. 1). Moreover, unlike
signal-anchor sequences in membrane proteins, the
signal sequences are often removed upon translocation
by a signal peptidase (SPase) (80). Depending on the
peptidase cleavage site, signal sequences are divided
into three classes (Fig. 1).

Class I signal peptides are cleaved from the preprotein
by the universally conserved, membrane-associated,
type I SPase (SPI). Consequently, the mature
proteins are either released from the membrane
or anchored to the lipid bilayer via a C-terminal
membrane anchor. In bacteria, the SPI consists of a
catalytic core (domain I) and, in some cases, a second
domain (domain II) whose role is not defined
(80). The eucaryal homolog of this SPI, SPC18, is part
of the ER membrane-bound heterooligomeric signal
peptidase complex (SPC) (29). In contrast to the bacterial
SPI, SPC18 does not contain a domain II, and
its catalytic site involves a Ser-His-Asp triad rather
than the bacterial Ser-Lys dyad (29, 80, 100). It is
intriguing that, while all known archaeal SPIs possess
a putative eucaryal-type Ser-His-Asp triad, some of them
also possess a domain that has similarity to the bacterial
domain II (3, 27). Site-directed mutagenesis of
the Methanococcus voltae SPI revealed the requirement
of the conserved Ser and His residues for activity
but, surprisingly, the presumed active-site Asp142
(corresponding to Asp273 in Escherichia coli) was not
essential (8). Conversely, mutation of a second aspartate
residue (Asp148, equivalent to E. coli Asp280) had
a severe negative effect on enzyme activity (8).

Computational analyses of signal sequences 
from the euryarchaeon Methanococcus jannaschii 
(74), and the crenarchaeon, Sulfolobus solfataricus 
(3), suggest that these signal sequences possess a 
charge distribution similar to those of bacteria, but 
their SPase recognition site is similar to that of eu- 
carya. However, despite these observed differences 
among the signal peptides and SPIs, class I signal pep- 
tides from both eucarya and archaea are properly tar- 
geted and processed when expressed in bacteria (85, 
105, 108). 

While some of these substrates are released into 
the extracellular environment, others are anchored 
to the membrane via C-terminal hydrophobic do- 
mains. For example, several archaeal S-layer glyco- 
proteins contain typical class I signal peptides and are 
associated with the membrane via a C-terminal an- 
chor (see Chapter 14). 

Certain archaeal Sec signal sequences resemble
bacterial class II signal peptides, which contain a
lipobox motif ([I/L/G/A]-[A/G/S]-C) and are recognized
by the type II SPase (SPII) (Fig. 1). In bacteria,
the terminal cysteine of the lipobox is modified upon
translocation by the addition of a diacylglycerol moiety
via a thioether linkage (48). Only acylated substrates
are recognized and cleaved by the SPII, which
temporally ensures that the substrate can be anchored
in the membrane via lipid prior to cleavage of
the membrane-anchoring signal sequence (see Chapters
11, 14, and 15). The new amino terminus is further
acylated, resulting in the mature lipoprotein that
will be anchored to the cytoplasmic or outer membrane
(95). While many secreted archaeal proteins do
possess signal peptides with a motif that resembles a
lipobox (13, 59, 60), and recent data suggest that
lipid modification occurs at the conserved cysteine
(65), homology searches have not resulted in the
identification of an archaeal SPII homolog. This may in
part be due to the addition of slightly distinct lipid
moieties (C20C20 diphytanyl diether) to the lipobox
cysteine in archaea, which may require unique features
of an SPII for recognition.
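
The lipobox consensus given above, [I/L/G/A]-[A/G/S]-C, translates directly into a pattern search. The following is a minimal sketch under stated assumptions. It reports only the first match in the sequence supplied and does not check for the charged and hydrophobic regions that a genuine signal peptide also requires. The function name is hypothetical. The first test sequence is one of the class 2 entries listed in Table 1, written without the gap that marks its cleavage site. The second is a class 1 entry from the same table.

```python
import re

# Lipobox consensus as given in the text: [I/L/G/A]-[A/G/S]-C
LIPOBOX = re.compile(r"[ILGA][AGS]C")


def find_lipobox_cys(n_terminal_region):
    """Return the 1-based position of the lipobox cysteine, or None if absent."""
    match = LIPOBOX.search(n_terminal_region.upper())
    if match is None:
        return None
    # match.end() is the 0-based index just past the cysteine,
    # which equals the cysteine's 1-based position.
    return match.end()


print(find_lipobox_cys("MKKVVPILVLLAALLLLGCT"))      # class 2 entry from Table 1
print(find_lipobox_cys("MRKFTLLMLLLIVISMSGIAGAAQ"))  # class 1 entry from Table 1
```

In practice the search would be restricted to the first few dozen residues of a precursor, since the three-residue pattern occurs by chance throughout longer proteins.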


Extracellular protein structures 


Archaea have evolved distinct mechanisms to as- 
semble subunits of extracytoplasmic structures after 
translocation via the Sec pathway. For example, ar- 
chaeal flagellins are synthesized as preproteins with 
signal sequences that are cleaved before the incorpo- 
ration of the protein into the flagellar filament (9, 
106). However, distinct from class I and class II signal 
peptides, flagellin signal sequences (class III), initially 
identified in bacterial type IV prepilin subunits, con- 
tain a SPase (prepilin-peptidase)-processing site that 
precedes the hydrophobic stretch (Fig. 1 and Table 1). 
This hydrophobic stretch constitutes part of the ma- 
ture protein and is essential for initial membrane- 
anchoring and subunit-subunit interactions that are 
critical for the biosynthesis of type IV pili and pilus- 
like structures. Note that archaeal flagellins do not re- 
semble bacterial flagellins, and bacteria have evolved 
a Sec-independent pathway for flagellin translocation 
and flagella biosynthesis (62) (see Chapter 18). 

Archaea also appear to utilize class III signal peptides
for the translocation and biosynthesis of other
extracytoplasmic structures. In addition to the archaeal 
flagellins, several sugar-binding proteins in S. solfata- 
ricus have been shown to contain typical class III sig- 
nal peptides (4). While the reason these binding pro- 
teins possess class III signal peptides is currently 
unknown, it is tempting to postulate that they also as- 
semble into surface structures (“bindosomes”) that 
may provide a selective advantage under certain growth 
conditions (2). 

Recent in silico analysis of all available completely
sequenced archaeal genomes using a perl program
that predicts archaeal class III signal peptides
revealed the presence of a diverse set of proteins with
class III signal peptides in archaea other than Sulfolobus.
These include, among others, binding proteins
as well as putative archaeal type IV pilins.
Genes encoding proteins identified by the perl program,
FLAFIND (http://signalfind.org), were often
located in an operon with other FLAFIND positives,
prepilin peptidases, and/or bacterial ATPases that
may provide energy for assembly of the extracellular
structures (104a). Further investigation will provide a
better understanding of the diversity of bacterial and
archaeal substrates containing such signal peptides,
presumably for the biosynthesis of multimeric cell
surface-associated structures.


Table 1. Signal peptide classes in Archaea


Organism/protein                             Signal sequence(a)

Sec signal sequences

Class 1 (processing peptidase: SPI)
Halobacterium salinarum Csg [59]             MTDTTGKLRAVLLTALMVGSVIGAGVAFTGGAAA ↓ AN
Methanothermus fervidus SlgA [15]            MRKFTLLMLLLIVISMSGIAGA ↓ AQ
Methanococcus maripaludis 1p(b)              MAMSMKKIGAIAVGGAMVASALATGAFA ↓ AE

Class 2 (processing peptidase: SPII(c))
Archaeoglobus fulgidus GlnH [4]              MKKVVPILVLLAALLLLG ↓ CT
Methanocaldococcus jannaschii BraC [4]       MPYYWGAILIGGVFLAG ↓ CT
Pyrococcus abyssi MalE [4]                   MKRGIYAVLLVGVLIFSVVASG ↓ CI

Class 3 (processing peptidase: FlaK/PibD)
Methanococcus voltae FlaB1 [15]              MNIKEFLSNKKG ↓ AS
Methanococcus vannielii FlaB1 [51]           MSVKNFMNNKKG ↓ DS
Sulfolobus solfataricus TreS [3]             MSRSDKFSNKEKMRRG ↓ LS

Tat signal sequences

Class 1 (processing peptidase: SPI)
Natronococcus sp. Amy [88](d)                MRRNHSHTSDSAGIDRRTVLRSSAAAGALALTGVTIGSTSAAA ↓ RS
Sulfolobus solfataricus Oxyr [88](e)         MLKLSRRDFLKISGATAVATAFILGGNSVA ↓ KR
Archaeoglobus fulgidus SelA [88](e)          MRKVMNSPDDGNGRRRFLQFSMAALASAAAPSSVWAES ↓ KI

Class 2 (processing peptidase: SPII(c))
Halobacterium salinarum BasB [57](e)         MHSTTRREWLGAIGATAATGLG ↓ CA
Halobacterium salinarum CosB [57](e)         MMDTPEHASTSSRRQLLGMLAAGGTTAVAG ↓ CT
Natronomonas pharaonis Hcy [64](e)           MKDISRRRFVLGTGATVAAATLA ↓ CN


(a) Cleavage site is indicated by arrowhead. Conserved amino acids at the cleavage site are in bold. The Tat motif is underlined.
(b) D. VanDyke and K. F. Jarrell, unpublished data.
(c) SPII homolog not found in archaea; SPII analog predicted.
(d) Signal peptidase cleavage site identified but no direct evidence for SPI cleavage.
(e) Predicted signal peptidase cleavage site.


Protein Targeting 


The initial TM or H region of many Sec substrates
is recognized by the SRP as it emerges from translating
ribosomes (26) (Fig. 2). After the SRP-ribosome
nascent chain (RNC) complex is targeted to the Sec
pore, the substrate is translocated cotranslationally
(26, 79, 99). SRP-independent substrates are translocated
posttranslationally, requiring SRP-independent
targeting factors, chaperones to keep the precursor in
a translocation-competent conformation, and an ATPase
to drive the translocation process (85). While
translocation across membranes can either be co-
or posttranslational, membrane protein insertion is
thought to mainly occur cotranslationally.


SRP-dependent targeting to the Sec pore 


The SRP is composed of an RNA molecule and
one or more protein subunits (26). In eucarya, where
this pathway has been studied extensively, the SRP
binds to the ribosome next to the peptide exit site. At
this site on the ribosome, the S domain of the SRP
RNA and the universally conserved SRP54 protein
component sample the nascent peptide and recognize
signal sequences as they are translated (110) (Fig. 2).
Once bound to an appropriate signal sequence, the
SRP undergoes a conformational change that allows
the second domain of the RNA (termed the Alu domain)
to bind the rRNA that overlaps with the elongation
factor-binding site of the ribosome (A site) (57, 63, 111).
As the Alu domain blocks both elongation factors and
tRNAs from entering the ribosome, translational elongation is
likely to be arrested (42). The RNC-SRP complex is
then targeted to the membrane, mediated by the affinity
of SRP for the membrane-associated SRP receptor
(SR) (73). The SR is anchored to the ER membrane
by its SRβ-subunit, while the soluble SRα-subunit is
responsible for the interaction with SRP54 (66) (Fig.
2). Once the SR has escorted the RNC-SRP complex
to the translocon, SR and SRP disengage in a
GTP-dependent manner. Concurrent with the resumption
of translation, the Sec substrate is transported through
the Sec pore (see “The Sec pore,” below) (114).


Figure 2. Mammalian SRP interaction with the ribosome. Upon cytoplasmic exposure of the initial TM or H region of many
Sec substrates, the SRP interacts with the ribosome nascent chain (RNC) complex via several points of interaction. The SRP54
protein recognizes and binds the nascent polypeptide, while SRP9/14 bind and block the A site (see text). The bending of the
SRP RNA molecule required for both of these interactions to occur simultaneously may be facilitated by SRP68/72. SRP
proteins are represented by their corresponding numbers. Modified from Current Opinion in Structural Biology (25) with
permission of the publisher.

Most known bacterial SRP RNA homologs pos- 
sess a 7S RNA with both the S and the Alu domains 
(77). Despite this prevalence, only one bacterial SRP 
protein that interacts with the Alu domain, the Bacil- 
lus subtilis protein Hbsu, has been identified (71, 81). 
Moreover, some phylogenetically diverse bacteria 
(e.g., E. coli and Streptomyces lividans) possess a 
minimalistic, yet essential, SRP. Their SRP consists 
only of an RNA molecule (that lacks the Alu domain) 
and the universally conserved SRP54 homolog. 

The archaeal SRP is essential and most closely resembles
the eucaryal SRP. The 7S RNA and SRP54
homologs are more similar to their eucaryal than to
their bacterial counterparts, and archaea contain a
homolog of the eucaryal SRP19 subunit (68, 69, 91,
92). While no additional archaeal homologs of either
eucaryal or bacterial SRP proteins have been identified,
a recent analysis of archaeal small RNAs suggests
an interaction of the ribosomal protein L7Ae
with the S. solfataricus 7S RNA (118) (see Chapter
8). It has been postulated that such an interaction
may facilitate the bending of the RNC-bound SRP to
allow the Alu domain of the RNA to interact with the
ribosomal exit tunnel and its A site to arrest translation
(Fig. 2). Since homology searches have not identified
homologs of the eucaryal proteins binding to
this domain (14, 85), it is intriguing to speculate that
the archaeal ribonuclear Alu domain by itself may be
sufficient for pausing translation. Alternatively,
biochemical characterization of bacterial and archaeal
SRPs may reveal the presence of additional protein
components involved in translational arrest.

Similar to the apparently simpler bacterial
and archaeal SRP, the essential bacterial and archaeal
SRP receptor homologs (FtsY) have a less complex
composition, as they lack the eucaryal SRβ-subunit
(Fig. 3). Consistent with the absence of the SRβ-subunit
in bacteria and archaea (which is required for tight
association of the SRP receptor to the membrane in
eucarya), cell-fractionation experiments showed that at
least 50% of the FtsY is localized in the cytoplasm (41,
61, 67). Recent copurification studies of the
membrane-associated E. coli FtsY suggest that the bacterial
and archaeal FtsY interacts directly with the Sec pore (7).


Figure 3. (See the separate color insert for the color version of this illustration.) Crystal structure of the M. jannaschii Sec61 protein-conducting channel. Views 
from the top (a) and the front (b). Faces of the helices that form the signal-sequence-binding site and the lateral gate through which TMs of nascent mem- 
brane proteins exit the channel into lipid are colored. The plug, which gates the pore, is green. The hydrophobic core of the signal sequence probably forms a 
helix, modeled as a magenta cylinder, which intercalates between TM2b and TM7 above the plug. Intercalation requires opening the front surface as indicated
by the broken arrows, with the hinge for the motion being the loop between TM5 and TM6 at the back of the molecule (5/6 hinge). A solid arrow
pointing to the magenta circle in the top view indicates schematically how a TM of a nascent membrane protein would exit the channel into lipid. Structure 
and legend reprinted from Nature (107) with permission from the publisher. 
SRP-independent targeting to the Sec pore


Sec substrates that are SRP independent emerge from
the ribosome and are translocated only after much of
the protein has been translated.
Thus, these substrates not only require SRP-indepen- 
dent targeting factors, but also require chaperones to 
keep them in an unfolded, translocation-competent 
conformation. One such chaperone in E. coli, the 
SecB chaperone, binds to slowly folding Sec sub- 
strates (89). The bound substrate is then targeted to 
the Sec pore by way of the affinity of SecB for SecA, 
the bacterial translocation ATPase that is crucial for 
providing energy for bacterial posttranslational pro- 
tein translocation (109; and see “The Sec pore,” be- 
low). SecA is also able to interact with Sec substrates 
in a SecB-independent manner. In fact, recent data 
suggest a direct interaction of SecA with the ribo- 
some, where it may also be involved in the selection 
of emerging posttranslationally translocated nascent 
chains (55). While no SecB homolog has been identi- 
fied in eucarya, universally conserved chaperones, 
such as Hsp70, are required for the targeting of un- 
folded proteins to the eucaryal Sec pore. Similarly, 
neither a SecB nor a SecA homolog has been identified 
in archaea, and distinct cytoplasmic chaperones may 
be involved in this process (14, 86). However, it is un- 
clear whether posttranslational protein translocation 
occurs in archaea (see “Energetics,” below). 


The Sec Pore 


The Sec pore consists of two universally conserved
components, Sec61α and Sec61γ (SecY and
SecE, respectively, in bacteria). A third component,
Sec61β in archaea and eucarya or SecG in bacteria, is
also involved (44, 46, 58, 107) (Fig. 3). In both bacteria
and eucarya, SecY/Sec61α and SecE/Sec61γ are
essential, while the third pore subunit (SecG or
Sec61β, respectively) is dispensable (32, 33). The
archaeal Sec61αβγ homologs are most similar to the
eucaryal counterparts. A high-resolution X-ray crys- 
tal structure of the Sec pore from the archaeon, 
M. jannaschii (the first and only such Sec pore struc- 
ture for any organism), revealed that it is a 1:1:1 het- 
erotrimer, with SecY forming a channel-like structure 
through which the nascent secretion substrate is likely 
to be translocated (Fig. 4) (107). The hourglass-shaped
channel possesses a 5- to 8-Å diameter constriction
between cytoplasmic and external hydrophilic
funnels in the center of the membrane, which
is lined with hydrophobic residues. It is blocked by a 
small helical segment on the extracytoplasmic side 
that has been called the “plug,” and in that configu- 
ration, marks the closed channel. While the structural 
data of this archaeal Sec pore were generally consis- 
tent with previous topology predictions based on ge- 
netic analyses of the E. coli pore, it also led to a revision 
of previous interpretations of some data. Combining 
mutant signal sequence recognition data with the 
structure allowed for the proposal of a “clamshell” 
model for the transition from a closed inactive form 
of the pore to an open active form. These analyses 
suggest that the plug is being moved from the 5- to
8-Å constriction to the external side of the cell. The
structure also revealed the site of the complex that is 
likely to open laterally toward the lipid phase to al- 
low for TM integration into the membrane. 




Figure 4. Sec machinery in the three domains of life. Components of the Sec machinery in representatives of bacteria (E. coli), 
archaea (H. volcanii), and eucarya (S. cerevisiae). Sec substrates are translocated into or across hydrophobic membranes via 
the universally conserved heterotrimeric Sec61 (SecYEG in bacteria) pore. Translocation through this protein-conducting chan- 
nel requires distinct sets of additional Sec components in bacteria, archaea, and eucarya. YidC and TRAM are only involved in 
the insertion of proteins into the bacterial cytoplasmic and the ER membrane, respectively. While ATP hydrolysis by SecA and 
Kar2p is involved in energizing Sec translocation in bacteria and eucarya, respectively, no archaeal translocation ATPases
have been identified. Cyt, cytoplasm. Reprinted from Current Opinion in Microbiology (84) with permission of the publisher.


376 POHLSCHRODER AND DILKS 


Sec Pore-Associated Components 


In vitro reconstitution studies suggest that the 
mammalian Sec-pore components are the only mem- 
brane components required for cotranslational 
translocation of proteins across lipid bilayers (39) 
(Fig. 3). However, it is thought that in bacteria and eu- 
carya additional pore-associated components are nec- 
essary for efficient co- and posttranslational protein 
translocation. None of these accessory components 
are universally conserved. For example, eucarya con- 
tain Kar2p, an ER luminal ATPase, which binds to nu- 
merous Sec substrates as they emerge from the Sec 
pore and prevents their retrograde movement into the 
pore. Kar2p may also be involved in the gating of 
the translocon (5, 43). This ATPase is associated with 
the pore via its interaction with the membrane pro- 
tein, Sec62, which together with Sec63 forms a sub- 
complex of the Sec pore (83, 93, 94). While Kar2p is 
not thought to be essential for mammalian cotranslational
protein translocation, both co- and posttranslational
protein translocation in yeast require this
ATP-dependent chaperone. Another eucaryal Sec-pore-
associated component, the oligosaccharyltransferase (OT),
may be involved in giving the substrate directionality 
by glycosylating proteins emerging from the pore, thus 
preventing their retraction. Several additional proteins 
interact with the Sec pore, including TRAM, TRAP, 
and RAMP, although their function in translocation 
is not clear (31, 34, 37-39, 54, 101, 103, 113). 

The E. coli SecYEG pore copurifies with four
distinct protein components: YidC and the heterotrimeric
complex SecDFYajC (Fig. 3) (23, 24, 76,
98). In addition, it transiently associates with the cy- 
toplasmic translocation ATPase, SecA (109). This 
ATPase binds substrate, inserts into the Sec pore with 
the substrate, releases the substrate, and exits back out 
of the pore. This ATP-dependent cycle effectively re- 
sults in “pushing” the substrate through the pore. 
While initially it was thought that a “push” through the 
pore was only required for the translocation of pro- 
teins across the membrane, recent studies suggest that 
the insertion of certain membrane proteins also de- 
pends on SecA (18, 88). A second component that is 
critical for the biogenesis of a subset of Sec-dependent 
(as well as Sec-independent) membrane proteins is 
YidC (117). Eucaryal homologs of YidC have only 
been identified in the mitochondrial inner membrane 
and the thylakoid membranes of chloroplasts (Oxa1 
and Alb3, respectively) (116). While its exact func- 
tion is not clear, it has been suggested to be involved 
in lateral integration, folding, and assembly of mem- 
brane proteins in bacteria and eucarya (17). The in- 
teraction of YidC with the SecYEG pore in bacteria 
may be mediated by SecDFYajC (76). 


In addition to promoting the interaction of YidC 
with the bacterial translocon, the SecDFYajC com- 
plex may promote the membrane cycling of SecA and 
may stabilize SecG (24, 56). Moreover, both SecD 
and SecF contain large periplasmic domains that are 
crucial for the function of the protein. This indicates 
these membrane proteins may play an extracytoplas- 
mic role late in translocation by facilitating the re- 
lease of Sec substrates from the pore (64). 

All complete archaeal genome sequences contain 
homologs of the eucaryal OT but apparently lack ho- 
mologs of the other eucaryal Sec accessory compo- 
nents (14, 86). Furthermore, all genome sequences for 
Euryarchaeota encode a distant homolog of the bac- 
terial YidC (116). However, it is not clear whether 
this protein is involved in archaeal membrane pro- 
tein insertion (116). Many Euryarchaeota also con- 
tain homologs of SecD and SecF (14, 86) (Fig. 3). The 
amino acid similarity among bacterial and archaeal 
SecD and SecF homologs is more significant than 
among bacterial and archaeal YidC homologs, and 
the predicted membrane topologies of these archaeal 
membrane proteins are conserved with the bacterial 
counterparts (28). 

Archaeal SecD and SecF have been studied in 
vivo in the model archaeon, Haloferax volcanii (45). 
Similar to E. coli, the H. volcanii SecD and SecF pro- 
teins form a membrane complex and a deletion of the 
secFD operon leads to a strong cold-sensitive pheno- 
type. Consistent with their involvement in Sec 
translocation, the secFD knockout strain exhibits a 
specific Sec-protein export defect (45). The structural 
and functional conservation among the bacterial and 
archaeal homologs suggests a common function. 
However, the proposed functions of the bacterial 
SecDFYajC complex include SecA cycling and SecG 
stability. Since the archaeal Sec pore does not contain 
a SecG homolog, and no archaeal SecA homolog has 
been identified, the mechanism has to be independent 
of SecA and SecG. Additional studies are necessary 
to reveal the role of SecDF in facilitating bacterial and 
archaeal protein translocation. 
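
The comparison of Sec components across the three domains described above can be summarized as a small data structure. The following is only a sketch in Python: the grouping of homologs under shared labels (for example "SecY/Sec61alpha") and the function names are assumptions made for illustration, not a curated database. The archaeal entries for YidC, SecD, and SecF reflect homologs reported for Euryarchaeota only, and the archaeal YidC is a distant homolog whose role in membrane protein insertion is unclear.

```python
# Sec components named in the text, listed per domain. Homologous subunits
# share one label so that set operations compare like with like.
# Labels and grouping are illustrative assumptions, not a curated resource.
SEC_COMPONENTS = {
    "bacteria": {
        "SecY/Sec61alpha", "SecE/Sec61gamma", "SecG",
        "SecA", "SecB", "YidC", "SecD", "SecF", "YajC",
    },
    "archaea": {
        "SecY/Sec61alpha", "SecE/Sec61gamma", "Sec61beta",
        "OT", "YidC", "SecD", "SecF",  # YidC/SecD/SecF: Euryarchaeota only
    },
    "eucarya": {
        "SecY/Sec61alpha", "SecE/Sec61gamma", "Sec61beta",
        "OT", "Kar2p", "Sec62", "Sec63", "TRAM", "TRAP", "RAMP",
    },
}


def universally_conserved(components=SEC_COMPONENTS):
    """Return the components present in every domain."""
    return set.intersection(*components.values())


def shared_only_by(a, b, components=SEC_COMPONENTS):
    """Return components found in domains a and b but in no other domain."""
    others = set().union(
        *(members for name, members in components.items() if name not in (a, b))
    )
    return (components[a] & components[b]) - others


if __name__ == "__main__":
    print("universal:", sorted(universally_conserved()))
    print("archaea + bacteria only:", sorted(shared_only_by("archaea", "bacteria")))
    print("archaea + eucarya only:", sorted(shared_only_by("archaea", "eucarya")))
```

Under these assumptions, only the two core pore subunits are common to all three domains, while the archaeal machinery shares one set of accessory components with bacteria and a different set with eucarya.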


Energetics 


Co- and posttranslational Sec translocation in 
S. cerevisiae and E. coli requires a luminal and cyto- 
plasmic ATPase, respectively (79). As archaea have a 
similar morphology to bacteria, it is unlikely that they 
possess an extracytoplasmic translocation ATPase 
analogous to eucaryal Kar2p, since ATP is not readily 
available in the extracellular environment. However, 
it is intriguing that archaea also lack a homolog of 
the bacterial SecA. Since no functionally equivalent 
ATPases have been identified in archaea, mechanisms 
by which energy is generated for archaeal protein
translocation remain elusive. 

Considering what is known about energetics of 
translocation in bacteria and eucarya, at least four 
distinct mechanisms for archaeal protein transloca- 
tion may be proposed. It is also possible that any 
number of these energy-coupling mechanisms may 
act synergistically to drive translocation through the 
archaeal Sec pore (Fig. 5). 

As noted in “The Sec pore,” above, in vitro re- 
constitution studies of the mammalian translocation 
system suggest that proteins can be translocated across 
lipid bilayers in an ATP-independent manner. This in- 
dicates that translation elongation may be the sole 
driving force of unidirectional movement of proteins 
through the Sec pore (39) (Fig. 5a). This is an attractive
model for archaeal protein translocation, in view
of the absence of a SecA or Kar2p homolog. How- 
ever, while this may be one route of archaeal protein 
translocation, recent secretion studies across the H. 
volcanii cytoplasmic membrane suggest that two sub- 
strates are translocated across the archaeal cytoplasmic 
membrane, entirely by a posttranslational mechanism 
(50). Moreover, the translocation of a Halobacterium 
sp. NRC-1 bacterioopsin fusion construct was dem- 
onstrated to occur posttranslationally (78). While 
these studies indicate that posttranslational protein 
translocation is possible in archaea, it should be noted 
that the substrates used in both studies were heterolo- 


Figure 5. Models of putative archaeal protein translocation energet- 
ics. See text for details. (a) Cotranslational translocation. (b) Post- 
translational translocation with a cytoplasmic energy-coupling 
protein. (c) Posttranslational translocation with extracytoplasmic 
activity. (d) Posttranslational translocation harnessing a gradient 
(e.g., ΔpH) across the cytoplasmic membrane. Figure reprinted
from FEMS Reviews (85) with permission of the publisher. 


gous fusion constructs. In addition, studies of BO in 
its native host are contradictory (see “Integral mem- 
brane proteins,” above), suggesting that its transloca- 
tion occurs cotranslationally (76). 

Data supporting posttranslational protein trans- 
location in archaea may indicate that an archaeal cy- 
toplasmic SecA-like protein exists and that it lacks 
significant amino acid sequence conservation with the 
bacterial SecA. Alternatively, a cytoplasmic nucleotide- 
hydrolyzing enzyme unrelated to SecA in structure 
and/or mechanism may be driving translocation 
through the Sec pore (Fig. 5b). 

Protein translocation may also be driven by one 
or several extracytoplasmic activities that provide di- 
rectionality by preventing movement of the polypep- 
tide chain back into the cytoplasm (Fig. 5c). In support 
of this, all complete genome sequences of archaea 
contain an OT, which may prevent the substrate from 
reentering the pore by glycosylating the translocat- 
ing polypeptide chains (see “Sec pore-associated 
components,” above). Similarly, the folding of the 
proteins aided by non-ATP-dependent extracytoplas- 
mic chaperones may result in the unidirectional 
movement of the polypeptide. SecD and SecF may 
play such a role in the mechanism of translocation 
(Fig. 5c). 

Finally, in vitro studies suggest that the proton 
motive force (PMF), in concert with the action of 
SecA, facilitates bacterial secretion via the Sec pore 
(70, 75). Furthermore, the PMF is apparently suffi- 
cient to drive translocation of proteins via the Tat 
pore (see “Tat machinery,” below). Thus, it is possible 
that an ion gradient across the archaeal membrane is 
the sole source of energy for protein translocation 
(Fig. 5d). 


THE Tat PATHWAY 


Many bacteria and archaea possess an additional 
general secretion pathway, described as the twin-argi- 
nine translocation (Tat) pathway. Unlike the Sec path- 
way, secretion across this pore is, apparently, solely 
driven by the proton gradient across the membrane, 
and substrates of the Tat pathway can be secreted af- 
ter significant, if not complete, folding of the protein 
has occurred in the cytoplasm (85). Since the Tat path- 
way does not require extracellular protein folding, or- 
ganisms that have this pathway possess a powerful 
alternative mechanism to achieve extracellular protein 
activity. Similar to the Sec pathway, recent analyses of 
the Tat pathway in archaea have provided insights 
into how the Tat pathway functions in all bacteria and 
archaea and revealed features of this secretory route 
that are ostensibly specific to archaeal species. 
Tat Signals and Substrates
Signals 


Although the Tat and Sec translocation ma- 
chineries are distinct, the features of the signals di- 
recting substrates to each pathway are remarkably 
similar. Similar to class I/II Sec signal peptides, Tat 
signals are located at the N terminus of the precur- 
sor and contain the same structure: a charged N ter- 
minus, followed by a stretch of hydrophobic amino 
acids (Fig. 1). In addition, Tat signal peptides often 
have a Sec-like, C-terminal signal cleavage site that is 
apparently processed by SPI (115). Despite the simi- 
larity of Sec and Tat signal peptides, general bio- 
chemical differences between the two must exist to 
prevent futile targeting of all Sec and Tat substrates to 
both pathways. 

The N-terminal charged region of a Tat signal 
sequence is represented by the conserved S/T-R-R-X- 
F-L-K motif (11). The core twin-arginine residues 
are essentially universally conserved, and replace- 
ment of these arginines with other amino acids often 
results in a significant inhibition of substrate secre- 
tion, presumably due to the loss of substrate recog- 
nition by the appropriate Tat machinery (1, 87, 
102). The surrounding residues are much less con- 
served. However, substitution of the surrounding 
residues with unfavorable residues can lead to a de- 
crease in translocation efficiency (102). In addition, 
the hydrophobic stretch of Tat signal peptides gen- 
erally has lower hydrophobicity than that found in 
Sec signal peptides (16). 
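
The conserved motif described above lends itself to a simple pattern search. The following Python sketch is an illustration only: it encodes the S/T-R-R-X-F-L-K consensus as a regular expression and, because the residues around the twin arginines are much less conserved, also counts how many flanking consensus positions match at each RR pair. The function names and the example peptides are hypothetical, and the hydrophobic stretch that follows the motif is not evaluated here.

```python
import re

# S/T-R-R-X-F-L-K as given in the text; "." stands for X (any residue).
CONSENSUS = "[ST]RR.FLK"

# Consensus residues flanking the twin arginines, as offsets from the first R.
FLANKING = {-1: "ST", 3: "F", 4: "L", 5: "K"}


def consensus_hits(seq):
    """Return 0-based start positions of full consensus matches."""
    return [m.start() for m in re.finditer(f"(?={CONSENSUS})", seq.upper())]


def flank_score(seq, rr_index):
    """Count flanking consensus positions (0 to 4) matched around an RR pair."""
    score = 0
    for offset, allowed in FLANKING.items():
        pos = rr_index + offset
        if 0 <= pos < len(seq) and seq[pos] in allowed:
            score += 1
    return score


def twin_arginine_sites(seq):
    """Return (position of first R, flank score) for every RR pair."""
    seq = seq.upper()
    return [(m.start(), flank_score(seq, m.start()))
            for m in re.finditer("(?=RR)", seq)]


if __name__ == "__main__":
    # Made-up peptides for illustration; not real signal sequences.
    for peptide in ("MSTRRDFLKGAAALAAGA", "MKRRSVLAG"):
        print(peptide, consensus_hits(peptide), twin_arginine_sites(peptide))
```

A bare RR search finds both example peptides, whereas the full consensus matches only the first; this is the trade-off between sensitivity and false predictions that motivates rule-based tools with additional criteria.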

The presence of the twin-arginine motif in the 
Tat signal sequence provided a means of identifying 
novel Tat substrates by computational pattern-match- 
ing techniques. Initial searches for secreted proteins 
with the twin-arginine residues proved fruitful (51). 
However, this approach also produced a large number
of false-positive predictions. Analysis
of the genome of one halophilic archaeon, Halobacterium
NRC-1, revealed that nearly all of the putative
secretory proteins possessed typical Tat signal peptides
(90). This large pool of putative Tat signal peptides
was used to develop a rule-based Perl script
(TATFIND) that predicts these peptides based on numerous
characteristics (90). TATFIND positively identifies
all confirmed Tat substrates in organisms such as
E. coli and B. subtilis (22, 52, 90). Moreover, due to 
the large number of criteria required for a successful 
match by TATFIND, the number of falsely predicted 
Tat substrates decreased dramatically compared with 
early pattern-matching efforts. Since the development 
of TATFIND, additional programs have been devel- 
oped for Tat substrate identification, such as TatP, 
which utilizes both pattern matching and a neural 


network (10). Future experimental verification of 
bacterial and archaeal Tat signal peptides that were 
predicted by computational methods (as well as those 
the programs failed to identify) will provide useful in- 
formation to further optimize the sensitivity and 
specificity of these programs. 
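
Once experimental data are available, the sensitivity and specificity of a prediction program can be computed directly. The sketch below, in Python, assumes three sets of protein identifiers: those predicted to be Tat substrates, those experimentally confirmed as Tat substrates, and those experimentally shown not to use the Tat pathway. The function name and the identifiers in the example are hypothetical.

```python
def sensitivity_specificity(predicted, confirmed_positive, confirmed_negative):
    """Return (sensitivity, specificity) of a set of predictions.

    sensitivity = TP / (TP + FN): fraction of confirmed substrates predicted.
    specificity = TN / (TN + FP): fraction of confirmed non-substrates rejected.
    Proteins without experimental data are ignored.
    """
    tp = len(predicted & confirmed_positive)
    fn = len(confirmed_positive - predicted)
    fp = len(predicted & confirmed_negative)
    tn = len(confirmed_negative - predicted)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    predicted = {"p1", "p2", "p3"}
    confirmed_positive = {"p1", "p2", "p4"}
    confirmed_negative = {"p3", "p5", "p6", "p7"}
    print(sensitivity_specificity(predicted, confirmed_positive, confirmed_negative))
```

In this made-up example, the missed substrate (p4) lowers sensitivity and the false prediction (p3) lowers specificity, which is why both confirmed substrates and confirmed non-substrates are needed to tune such programs.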


Substrates 


Many of the earliest Tat substrates that were de- 
scribed were redox proteins from E. coli that incorpo- 
rate their cofactor in the cytoplasm (11). After synthe- 
sis, many of these proteins are bound by chaperones 
that prevent interaction with the secretion machinery 
until the substrate accepts its cofactor (82). Matura- 
tion to the holoenzyme is likely to render the sub- 
strates incompatible with secretion through the 5- to 
8-A Sec pore constriction (107). Hence, the Tat path- 
way was originally viewed as a specialized transport 
system required mainly for a small class of apoen- 
zymes whose prosthetic group is incorporated prior 
to translocation. 

The availability of the TATFIND program made 
it possible to search entire genomes very rapidly for 
the presence of putative Tat substrates. By using this 
program, it became apparent that the Tat pathway 
was being used to vastly different extents, with the 
number of predicted substrates ranging from 0 to 
145 in different bacteria and archaea (22). Similar to 
E. coli, many bacteria and archaea appear to employ 
the Tat pathway for the export of cofactor-containing 
redox proteins. However, a large number (in some 
cases, the majority) of putative Tat substrates were 
not cofactor-containing redox proteins, and included 
proteases, virulence factors, substrate-binding pro- 
teins, and polymer-degrading enzymes. Thus, an im- 
portant finding of the signal peptide predictions from 
the largely archaeal-based program, TATFIND, was 
that the Tat system appeared to be a general secre- 
tion pathway, rather than a specialized translocation 
system (22). 

Many Tat substrates are released into the peri- 
plasm or extracellular space. However, numerous 
substrates are attached to the membrane via one of 
three distinct mechanisms: (i) interaction with mem- 
brane proteins, (ii) a C-terminal membrane anchor, or 
(iii) lipid modification. Five different Tat substrates 
have been experimentally shown in E. coli to possess 
a C-terminal transmembrane segment that integrates 
into the membrane in a YidC-independent fashion 
(47). Analysis of TATFIND results indicated that this 
class of substrate also exists in archaea. Comprehen- 
sive bioinformatic examination of putative Tat sub- 
strates from halophilic archaea indicates that the ma- 
jority of their Tat signal peptides contained typical 
lipoboxes (13). It has since been demonstrated experimentally
that many of these proteins are secreted via
the Tat pathway and that their membrane localiza- 
tion depends on the critical cysteine residue in the 
lipobox (K.C. Dilks and M. Pohlschröder, unpub- 
lished results). 

Sec substrates in E. coli that are anchored to the 
membrane in either of these ways require lateral in- 
tegration of the C-terminal transmembrane segment 
into the membrane or signal anchoring of the lipopro- 
tein in the membrane until it is processed by the 
proper peptidase and transferase/transacylase (72, 
107). Tat substrates may also achieve proper localiza- 
tion by means of lateral integration and signal an- 
choring. However, it is not clear how these mecha- 
nisms would be compatible with translocation through 
the Tat apparatus. 

Fourteen of the 23 complete archaeal genome se- 
quences available from NCBI possess the necessary 
genes (tatA and tatC) for a functional Tat pathway 
(Table 2). Examination of archaeal genome sequences 
with TATFIND predicted that most of these 14 or- 
ganisms encode very few Tat substrates, mostly redox 
proteins (Table 2). For example, TATFIND identified 
9 of the 14 putative Tat substrates in the genome of 
Picrophilus torridus as likely to be cofactor-containing 
redox proteins. In sharp contrast, the three members 
of the Halobacteriaceae (Haloarcula marismortui, 
Halobacterium NRC-1, and Natronomonas pharao- 
nis) encode a large number of putative Tat substrates, 
the majority of which are not redox proteins (Table 2). 
Computational analyses of the secretory proteins 
from these haloarchaea have revealed that this group 
of organisms uses the Tat pathway as the major 
translocation route, rather than the Sec pathway, 
which is used by most other organisms (13, 20, 30, 
90). The preferential routing of proteins to the Tat 
pathway in haloarchaea may be in response to the 
high cytoplasmic and extracellular salt concentrations 


(13, 90). Such salt conditions could increase the rate 
of spontaneous protein folding in the cytoplasm, or 
alternatively, necessitate a high level of chaperone- 
mediated protein folding in the cytoplasm. The Tat 
pathway, as opposed to the Sec pathway, would be 
able to accommodate the secretion of these folded 
proteins. 


Tat Machinery 


Three functionally distinct Tat machinery proteins 
have been identified: TatA, TatB, and TatC (Fig. 6A) 
(96, 112). TatA and TatB possess a single N-terminal 
transmembrane segment and a cytoplasmic C-termi- 
nal region composed of an amphipathic helix domain 
and a region that has not been structurally character- 
ized. TatC typically contains six TM segments with N 
and C termini localized in the cytoplasm. 

In E. coli, TatA, TatB, and TatC are integral 
membrane proteins that differentially interact to form 
two distinct oligomeric structures. A complex of multiple
heterodimers of TatB and TatC acts as the initial
site of membrane interaction for cytoplasmic Tat sub- 
strates (Fig. 6B) (1, 15). Tat substrates that have had 
the twin-arginine residues in the signal peptide altered 
lose the ability to interact with the TatBC structure 
(1). Once the TatBC complex is occupied by a sub- 
strate, it engages the second oligomeric structure (a 
ring-shaped TatA multimer) in a PMF-dependent 
manner (1, 15) (Fig. 6B). The cytoplasmic region of 
TatA presumably plugs one end of the aqueous mem- 
brane pore created by the TatA multimer (36). Due to 
its structure and late-stage interaction with Tat sub- 
strates, the TatA multimer is believed to act as the 
protein-conducting channel of the Tat pathway. Bio- 
chemical and microscopic data indicate that the size 
of the TatA structure varies from approximately 10 to 
40 copies (36, 77). The different sizes of TatA struc- 
tures would help rationalize how this pathway is able 


Table 2. The Tat pathway in Archaea deduced from complete genome sequences 


Organism                                Kingdom^a  No. of ORFs  Putative Tat substrates  TatA and TatC present
Archaeoglobus fulgidus                  Eury       2,420        10                       Yes
Halobacterium sp. NRC-1                 Eury       2,622        68                       Yes
Methanocaldococcus jannaschii           Eury       1,786        0                        No
Methanosarcina acetivorans              Eury       4,540        >                        Yes
Methanothermobacter thermautotrophicus  Eury       1,873        1                        No
Natronomonas pharaonis                  Eury       2,622        106                      Yes
Pyrobaculum aerophilum                  Cren       2,605        14                       Yes
Pyrococcus furiosus                     Eury       2,125        3                        No
Sulfolobus acidocaldarius               Cren       25223        5                        Yes
Sulfolobus tokodaii                     Cren       2,825        4                        Yes
Thermoplasma acidophilum                Eury       1,482        3                        Yes

^a Cren, Crenarchaeota; Eury, Euryarchaeota.
Figure 6. Tat components and model of Tat secretory mechanism. (A) Typical structure of Tat machinery components in bacteria
and archaea. The postamphipathic helical C terminus for TatA and TatB has been excluded for visual simplicity. (B) Model
of Tat substrate translocation in E. coli. Tat substrates (oval) obtain tertiary structure in the cytoplasm and are targeted to the
membrane TatBC complex in an unknown manner. Once bound to substrate, the TatBC complex interacts with a multimeric
TatA ring in a ΔpH-dependent manner. The plugged inactive TatA ring likely alters to an active unplugged conformation upon
engaging substrate. There are insufficient data describing points of protein interactions, and the depicted points of interaction
between proteins are not meant to be completely accurate.


to accommodate the passage of substrates of vastly 
different sizes, while maintaining a large degree of 
membrane impermeability. 

While many archaea encode multiple tatA and 
tatC genes, all archaea (and several bacteria) lack a 
copy of tatB (22). The absence of TatB from many or- 
ganisms was surprising in view of the requirement of 
TatB for Tat translocation in E. coli (97). However, 
recent studies in E. coli demonstrated that mutant al- 
leles of tatA could suppress the translocation defect of 
a tatB deletion (12). Thus, TatA homologs may pro- 
vide the function of both TatA and TatB in bacteria 
and archaea that do not have TatB. 

Analysis of the B. subtilis Tat pathway revealed 
the presence of TatA homologs in the cytoplasm, in 
addition to their presence in the membrane (87). Cytoplasmic
localization of TatA homologs was also observed
in Streptomyces lividans and the haloarchaeon
H. volcanii (19, 21). The B. subtilis TatA homolog 
(TatAd) coimmunoprecipitates with its cytoplasmic 


Tat substrate, PhoD (87). Note that, despite extensive
investigation of the mechanism of the Tat pathway
in E. coli, it remains unclear how substrates are di- 
rected to the membrane TatBC complex (Fig. 6B). Al- 
though substrate interaction with cytoplasmic TatA 
may represent a membrane-targeting mechanism, it 
is inconsistent with the absence of this protein from 
the cytoplasm and the late translocation role of this 
protein in E. coli. Future investigations of the nature 
and role of cytoplasmic and membrane TatA in bac- 
teria and archaea will be useful in reconciling the lo- 
calization and function of this protein. 

The most well-defined archaeal Tat pathway is 
that of the model archaeon H. volcanii. As described 
above (“Tat signals and substrates”), computational
predictions and recent experimental analyses of pu- 
tative Tat substrates indicate that the haloarchaea use 
the Tat pathway for the export of the majority of se- 
creted proteins. Genetic studies in H. volcanii, dem- 
onstrating the essential nature of this pathway, are 
consistent with its predicted extensive use. The two
TatC homologs in H. volcanii and other haloarchaea 
have atypical features. Specifically, one of the TatC 
homologs has an extended cytoplasmic N terminus, 
while its paralog contains 14 predicted TM segments, 
in which the sequence resembles two independent 
TatC proteins fused by 2 TMs (13, 21). These modi- 
fications of the TatC components have only been 
identified in the haloarchaea and may represent a 
functional adaptation to either high salt or increased 
usage of the Tat pathway in this group of archaea. 


PERSPECTIVE: THE NEXT FIVE YEARS 


Recent in vivo, in vitro, and in silico studies have 
led to a better understanding of archaeal protein 
translocation. Moreover, the elucidation of an ar- 
chaeal Sec-pore X-ray crystal structure strikingly 
demonstrates how analysis of this pathway in archaea 
can significantly advance the field of protein translo- 
cation as a whole. In addition to standard molecular 
and biochemical approaches, it is now crucial to de- 
velop in vitro Sec and Tat protein translocation sys- 
tems that will more clearly define the mechanisms of 
these pathways and reveal the energetics of these cel- 
lular processes in archaea. Moreover, with the grow- 
ing number of auxotrophic strains, selectable markers, 
recombinatory techniques, and self-replicating vec- 
tors, the future use of genetic selections should help 
to identify potential archaea-specific aspects of protein 
translocation (6). Such genetic selections may also re- 
veal translocation components conserved in bacteria 
and/or eucarya that have not yet been identified. 
Chapter 18


Flagellation and Chemotaxis 


KEN F. JARRELL, SANDY Y. M. NG, AND BONNIE CHABAN 


INTRODUCTION 


Flagellation is a characteristic that is found widely 
throughout the Archaea (Fig. 1) (45, 111). While the 
gross observations of a rotating structure with a hook 
responsible for swimming motility indicate an affinity 
of the archaeal flagellum with the bacterial flagellum, 
biochemical, genetic, and ultrastructural evidence has 
accumulated over the years indicating that the ar- 
chaeal flagellum is a unique motility structure. It is 
distinct from the bacterial flagellum but with several 
similarities to another bacterial motility structure, 
namely the type IV pilus, an organelle responsible for 
a type of surface translocation termed twitching (11, 
12, 23, 84, 111). The presence of flagella on archaea 
inhabiting extremes of temperature, salinity, and pH 
(Fig. 2) indicates a remarkable stability of the structure.
Novel posttranslational processing of the flagellins
(unusual type IV pilin-like signal peptides,
novel N-linked glycosylation) (5, 8, 9, 13, 24, 116, 
118) is known, and insights into the assembly of the 
flagellins are slowly emerging (see Chapter 11), aided 
by the continuous improvement in the genetic tools 
for various archaea (see Chapter 21). Studies of the 
preflagellin processing have also aided our under- 
standing of protein export in these organisms (7, 86, 
89) (see Chapter 17). 

One of the better-studied aspects of archaeal
physiology is the understanding of various types of 
taxis (phototaxis, chemotaxis), especially in halobac- 
teria (64, 90, 107). While the components and mole- 
cular mechanisms are similar in bacteria and archaea, 
novel findings in archaea have been reported (56, 64). 
Both the bacterial and archaeal sensory systems have 
a two-component signaling mechanism at their heart. 
However, while the response regulator, CheY-P, is
known to bind to the switch component FliM in
bacteria (107), a functionally equivalent protein to FliM
has yet to be identified in archaea (29, 79). Indeed,
the method of linkage of the unusual archaeal motil- 
ity organelle with the bacterial-like chemotaxis sys- 
tem remains a compelling mystery. Continued study 
of this fascinating motility structure and associated 
sensing system will no doubt add to our knowledge of 
archaeal physiology and genetics well beyond the lim- 
ited area of bacterial and archaeal movement. 


ULTRASTRUCTURE 


Phenomenologically, the archaeal flagellum re- 
sembles the bacterial flagellum in being a rotating 
structure responsible for swimming motility. However,
the absence, in the completely sequenced genome
of any archaeon, of detectable homologs of
genes involved in bacterial flagella structure, function,
and assembly indicates that the archaeal organelle is
likely composed of archaeal-specific gene products 
(29). Electron microscopic examination of flagella 
from numerous diverse archaea indicates several com- 
mon features. Archaeal flagella are typically much 
thinner than their bacterial counterparts, usually 10 
to 13 nm (17, 27, 33, 46, 99), compared with bacterial 
flagellum diameters that are typically about 20 nm 
(48). In the case of Halobacterium salinarum filaments,
there are ~3.3 subunits per turn of a ~19-Å
pitch left-handed helix compared with ~5.5 subunits
per turn of an approximately 26-Å pitch right-handed
helix for plain bacterial filaments (23). H. salinarum 
cells alternate between forward and reverse swim- 
ming due to clockwise (CW) and counterclockwise 
(CCW) rotation of the right-handed flagella bundle, 
respectively (3). CW rotation of the halobacterial fla- 
gella exerts a pushing force, propelling the cell for- 
ward, while CCW rotation results in the flagella 
pulling the cell, so that the cell swims with the flagella 

Figure 1. Distribution of flagellation throughout the domain Archaea. +, at least some species of the genus are flagellated;
-, no members of the genus are reported to be flagellated. Reproduced with modifications from FEMS Microbiology
Reviews (111).


at the front (54). This is unlike most bacteria, such 
as Escherichia coli, which alternate between swim- 
ming (CCW rotation of the left-handed flagella bun- 
dle) and tumbling (CW rotation). Also unlike most 
bacterial flagella, the archaeal flagella filaments are 
composed of multiple different flagellins (with the ap- 
parent exception of Sulfolobus species) (5), which are 
usually, perhaps universally, posttranslationally mod- 
ified. N-Linked glycosylation structures attached to 
flagellins have been determined for H. salinarum (60, 
118) and Methanococcus voltae (116). In addition, 
there is indirect evidence from specific staining (28, 
32, 51, 87, 96, 99) and proteomic analyses (38, 74) of 
further, as yet uncharacterized, modifications on the 
flagellins from other archaeal species. The core struc- 
ture of the flagella from H. salinarum is similar to 
that of type IV pili (23, 115), and in sharp contrast 
to bacterial flagella, the archaeal filament has no de- 
tectable central cavity (23). This observation implies 
that the assembly of archaeal flagella is distinct from 
that of bacterial flagella and is consistent with a long- 
standing hypothesis (36, 45). Flagella dissociation ex- 
periments in Natrialba magadii indicate that flagella 
filaments are composed of thinner filaments termed 
protofilaments and that the flagella can be untwined 
into protofilaments upon a decrease in NaCl
concentration (33). Immunolabeling experiments indicated
that the N. magadii flagella are composed of several 
longitudinal rows of protofilaments, each consisting of 
one flagellin type (88). Cross-sectional analysis of the 
density map of H. salinarum flagella indicates the 
presence of 10 protofilaments (Fig. 3) (23). 
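
The helical parameters quoted above can be turned into a per-subunit axial rise and rotation, which gives a quick consistency check against the section spacing and rotation quoted in the legend to Fig. 3. The short Python sketch below does only this arithmetic. The function name is illustrative, and the inputs are the approximate values given in the text, so the results are approximate as well.

```python
def helix_step(subunits_per_turn, pitch_angstrom):
    """Return (axial rise per subunit in Å, rotation per subunit in degrees)."""
    rise = pitch_angstrom / subunits_per_turn
    rotation = 360.0 / subunits_per_turn
    return rise, rotation

# Approximate values quoted in the text (23)
archaeal = helix_step(3.3, 19.0)    # H. salinarum filament
bacterial = helix_step(5.5, 26.0)   # plain bacterial filament

print("H. salinarum: rise %.2f Å, rotation %.1f°" % archaeal)
print("bacterial:    rise %.2f Å, rotation %.1f°" % bacterial)
```

For the H. salinarum values this works out to a rise of roughly 5.8 Å and a rotation of roughly 109° per subunit, close to the 5.8-Å spacing and 108° rotation used for the axial projection in Fig. 3.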


Anchoring Structure 


A cell-proximal hook region has been detected in 
stained samples of various archaeal flagella prepara- 
tions. Staining with phosphotungstic acid routinely re- 
sults in better resolution of the hook region than stain- 
ing with uranyl acetate (10, 27, 52). Compared with 
the defined hook length in bacteria, archaeal hooks 
vary considerably in length and are observed to be 
longer, in general (10, 27). In addition, the junction be- 
tween the filament and hook in archaeal flagella is in- 
distinguishable, in contrast to the case in bacterial fla- 
gella (10). In M. voltae, there is a large enrichment in 
flagellin FlaB3 in flagella stub samples. These prepara- 
tions are enriched in cell-proximal filament fragments 
(10), suggesting that in archaea the hook region may 
be composed primarily of one of the multiple flagellins. 

Figure 2. Electron micrographs depicting flagella on a diverse range of archaea. (A) Methanococcus voltae, negative stain, bar
= 1 µm. Reproduced from the Journal of Bacteriology (10) with permission of the publisher. (B) Archaeoglobus veneficus,
platinum shadowed, bar = 1 µm. Picture courtesy of Reinhard Rachel. (C) Sulfolobus tengchongensis, negative stain, bar =
1 µm. Picture courtesy of Li Huang. (D) Pyrococcus furiosus, platinum shadowed, bar = 1 µm. Picture courtesy of Reinhard
Rachel. (E) Halobacterium salinarum, carbon/platinum shadow. Reproduced from the Journal of Molecular Biology (3) with
permission of the publisher and D. Oesterhelt.

The anchoring structure or basal body equivalent
of archaeal flagella has proven to be difficult to
visualize. This may be complicated by the novel cell
envelopes of archaea, which lack murein. Indeed, in the
best-studied cases, flagella have been isolated from or- 
ganisms (Methanococcus, Halobacterium) that have 
an extremely simple envelope structure with a protein 
or glycoprotein S layer overlying a cytoplasmic mem- 
brane—an envelope structure that is simple in com- 
parison with bacteria (57). This simple wall structure 
may preclude the presence of the typical basal body 
rings that are associated with the peptidoglycan layer 
and the outer membrane in bacterial flagella. In
archaea, such rings have, on occasion, been reported
(Methanococcus thermolithotrophicus, Methanospir- 
illum hungatei) (27), although more typically archaeal 
flagella have an ill-defined knob at their cell proximal 
end (10, 47, 52, 59). It is unclear whether these
observations indicate that the anchoring structure lacks
rings typical of bacterial basal bodies, or whether the archaeal
equivalents are simply more delicate structures that
are lost upon flagella preparation.

It is apparent from several lines of evidence that 
further anchoring of the flagella of archaea occurs by 
a structure that may be located beneath the cytoplas- 
mic membrane. This was predicted more than 20 years 
ago (3), based on electron microscopy of halobacterial 
cells. Kupper et al. (59) were able to isolate halobac- 
terial flagella inserted into a polar cap structure, 
where the filaments of an entire flagella bundle were 
attached. These authors could not determine whether 
the polar cap was part of the wall, membrane, or sub- 
cytoplasmic membrane. However, thin sections of 
many archaea, such as M. voltae (46, 58) and H. sali- 
narum (69, 100), have revealed the presence of polar 
membrane-like structures. In addition, a second
distinct, disk-like lamellar structure has been detected
below the cytoplasmic membrane in H. salinarum (69,
100). This complex structure is always located at the
cell pole exactly where the flagellar bundle is located
and lies about 20 nm beneath the cytoplasmic
membrane (69).

Figure 3. Density map of the H. salinarum flagellar filament. (a) Cross-section through the three-dimensional density map of
a H. salinarum flagellum. The triskelion-like shape and lack of a central channel are shown. (b) An axial projection of a stack
of ten sections as shown in (a) spaced 5.8 Å apart and rotated by 108° relative to each other. Reproduced from the Journal of
Molecular Biology (23) with permission of the publisher and S. Trachtenberg.


ARCHAEAL FLAGELLA GENES 


One genetic locus, containing up to 13 flagella- 
related genes, is present in the genome sequences of 
the majority of flagellated archaea (Table 1). This 
locus normally contains multiple flagellin genes 
arranged in tandem, followed by a number of con- 
served flagella-associated genes (flaC-flaJ or a subset 
thereof; Fig. 4). A gene for the preflagellin peptidase is 
often present outside of this cluster (8, 9). In M. 
voltae, genes involved in the transfer of a novel N- 
linked glycan to the flagellins have recently been iden- 
tified, separate from the main fla cluster, although 
their role is not limited to flagellin modification (21a). 


Flagellins 


The best-studied genes in the archaeal flagella 
loci are the flagellin genes. Among the 25 completely 
sequenced genomes available at the NCBI website 
(http://www.ncbi.nlm.nih.gov/), 19 have readily iden- 
tified flagellin genes. These include three species of 
Methanosarcina, which have never been reported to
be flagellated or motile, although another
Methanosarcina species, M. baltica, can be flagellated. On the
other hand, two species reported to have flagella and/ 
or be motile (Pyrobaculum aerophilum and Metha- 
nopyrus kandleri) do not have any genes currently 
annotated as flagellins. Examination of the M. kand- 
leri genome does reveal the presence of genes that en- 
code several flagellin-like proteins, located very close 
to apparent flaHIJ homologs (K. F. Jarrell, S. Y. M.
Ng, and B. Chaban, unpublished results). The pre- 
dicted proteins encoded by these genes have short sig- 
nal peptides ending in Lys-Gly, followed by a hydro- 
phobic stretch, typical of archaeal flagellins. Between 
these small genes lie two genes predicted to be prepilin 
peptidase homologs. However, these flagellin-like 
proteins are very small, less than 80 amino acids. 
Whether these are the proteins responsible for the ap- 
pendages observed on M. kandleri cells remains to be 
determined. 

Flagellin genes are present as multigene families 
in all archaea, with the exception of Sulfolobus species, 
where only a single flagellin gene is present. Two to six 
flagellin genes are usually present in tandem. How- 
ever, in some archaea, including H. salinarum, one 
group is present at a separate locus on the chromo- 
some from another tandem set (36, 77). In Haloarcula 
marismortui, one flagellin gene is located on a plasmid 
(TIGR gene annotation NT01HMO0026; http://www.tigr.org/).
The flagellin genes are known to be cotranscribed
with the downstream fla genes in the several
Table 1. Distribution of fla gene clusters in archaea^a


fla genes 
Organism Flagellin 
C D E F G H I J K 
Crenarchaeota 
Aeropyrum pernix 14601712 - - - - 14601709 14601707 14601706 14601705 - 
14601713 14601708 14601914 
14600901 
Pyrobaculum aerophilum - - - - - - - 18313579 18313580 - 
18312170 
18313110 
Sulfolobus acidocaldarius 70606948 - - - 70606945 70606946 70606944 70606943 70606942 70605985 
70608025 
70607238 
Sulfolobus solfataricus 15899081 - - - 15899078 15897078 15899077 15899076 15899075 15897089
Sulfolobus tokodaii 15922853 - - - 15922856 15922855 15922857 15922858 15922859 15922589
15920739 
1521686 
Euryarchaeota 
Archaeoglobus fulgidus 11498659 - 11498658 11498656 11498657 11498655 11498654 11498653 11498541
11498660 
Ferroplasma acidarmanus - - - - - - - - - -
Haloarcula marismortui 55376132 55378885 55378884 55378883 55378882 55378881 55378880 55379184 
55380089 55378048 55378047 
55378891 
Halobacterium salinarum 15790079 15790076 15790074 16554480 16554479 15790072 15790071 15791095 
15790080 15790075 15789975 
15790081 15789521 
15790120 
15790121 
15790252 
Methanocaldococcus jannaschii 15669081 15669084 15669085 15669086 15669087 15669088 15669089 15669090 15669091 15669092 
15669082 1591924 1591923 
15669083 1591480 
Methanococcoides burtonii 68185311 68186332 68186333 68186334 68186336 68186337 68184794 
68185312 68185854 68185307 68185306 68185305 
68185254 
Methanococcus maripaludis 45359229 45359232 45359233 45359234 45359235 45359236 45359237 45359238 45359239 45358118 
45359230 
45359231 
Methanopyrus kandleri - - - - - - - 20094144 19887149 20094140 
Methanosarcina acetivorans 20091879 - 20091878 20091876 20091877 20091875 20091874 20091873 20089408 
20091880 20091896 20091898 20091897 20091899 20091900 20091901 
20091895 20089132 
20091914 
20092769
Methanosarcina barkeri 68080180 - 68080179 68080177 68080178 68080176 68080175 68080174 68133328
68081047 
Methanosarcina mazei 21226424 - 21226423 21226421 21226422 21226420 21226419 21226418 21227780 
21226425 21226519 21226517 21226518 21227041 21226515 21227039 
21226520 21226516 21227618 21226514 
21227040 
21228400 
Methanothermobacter thermautotrophicus - - - - - 2622833 2622834 2621485
Nanoarchaeum equitans - - - - - 41614965 - - 
41615211 
Picrophilus torridus - - - - - - - - 
Pyrococcus abyssi 14521692 14521691 14521690 14521689 14521688 14521687 14521686 14521685 14521786 
14521693 14521589 14521587 
14521694 
Pyrococcus furiosus 18976709 18976708 18976707 18976706 18976705 18976704 18976703 18976702 18976843 
18976710 18977366 18977364 
Pyrococcus horikoshii 14590447 14590452 14590453 33359301 14590455 14590456 14590457 14590458 14590360 
14590448 14590539 14590540 
14590449 14590541 
14590450 
14590451 
Thermococcus kodakaraensis 57639973 57639978 57639979 57639980 57639981 57639982 57639983 57639984 57639988 
57639974 57639935 
57639975 57641788 
57639976 
57639977 
Thermoplasma acidophilum 16082620 16081659 16081660 16081661 16082646 16081662 16081663 16081664 - 
16081658 16081953 16081951 
16081950 
Thermoplasma volcanium 13542257 13541439 13541440 13541441 13541442 13541443 13541444 13541445 -
13541438 13541846 13541844 
13541843 


^a GI numbers are for proteins from genome sequences at NCBI Genomes (www.ncbi.nlm.nih.gov/genomes/static/a.html).
Figure 4. Flagella gene families in selected archaeal species. Similar shadings indicate homologs shared among families. Genes are
transcribed in the direction of the arrows. In H. salinarum, the B flagellin genes are adjacent to the accessory genes while the
A flagellin genes are located elsewhere on the chromosome. One of the flagellin genes of H. marismortui is located on a plasmid.


instances where this has been studied (various 
methanococci [50, 113], Thermococcus (formerly Py- 
rococcus) kodakaraensis [75]). In H. salinarum, the 
flagella-associated fla genes are present next to tandem 
flagellin genes. However, the orientation of the two 
gene clusters is inverted, thereby precluding cotran- 
scription, though not necessarily coregulation. 
Proteins from all flagellin genes are predicted to 
be made with short signal peptides, usually ending 
with a basic amino acid and a glycine, similar to type 
IV pilins (111). Typical signal peptides of about 
12 amino acids are found in many flagellins, espe- 
cially from halophiles and methanogens, while even 
shorter ones of 4 amino acids are predicted for fla- 
gellins from Pyrococcus sp. Analysis of the deduced 
amino acid sequence also indicates the presence of N- 
linked glycosylation sequons (Asn-Xaa-Ser/Thr) in 
the vast majority of archaeal flagellins (76). The num- 
ber of such sequons ranges up to 16 for one flagellin 
of M. thermolithotrophicus. Glycosylation is
believed to be widespread among archaeal
flagellins and may account for most of the discrepancies
observed between the predicted molecular weights
and those observed in sodium dodecyl sulfate-polyacrylamide
gel electrophoresis (SDS-PAGE) analyses
(15, 45, 104, 111). The N termini of archaeal flagellins
are very hydrophobic (50), and there is considerable
similarity in sequence to the conserved N termini
of type IV pilins in bacteria (30).
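
The sequence features described above (a short signal peptide ending in a basic residue and a glycine, followed by a hydrophobic N terminus) lend themselves to a simple screen of predicted proteins. The Python sketch below is illustrative only. The function name, the 4-to-15-residue search range, the hydrophobic residue set, the 15-residue window, and the 0.6 threshold are all assumptions chosen for the example rather than published criteria, and both test sequences are invented.

```python
BASIC = set("KR")
HYDROPHOBIC = set("AVILMFWC")  # assumed residue set, for illustration only

def find_preflagellin_site(seq, min_len=4, max_len=15, window=15, threshold=0.6):
    """Return the putative signal peptide length, or None if no site is found.

    A site is accepted when position -2 is basic (Lys/Arg), position -1 is Gly,
    and at least `threshold` of the next `window` residues are hydrophobic.
    """
    seq = seq.upper()
    for n in range(min_len, max_len + 1):
        if len(seq) < n + window:
            break
        if seq[n - 1] != "G" or seq[n - 2] not in BASIC:
            continue
        mature_start = seq[n:n + window]
        frac = sum(aa in HYDROPHOBIC for aa in mature_start) / window
        if frac >= threshold:
            return n
    return None

# Invented sequences: a 12-residue signal peptide ending in Lys-Gly,
# and a variant in which the basic residues are replaced by Glu.
candidate = "MKIKEFMSNKKG" + "ASGVGTLIVFIAMVLVAAVA"
variant = "MKIKEFMSNEEG" + "ASGVGTLIVFIAMVLVAAVA"

print(find_preflagellin_site(candidate))
print(find_preflagellin_site(variant))
```

A screen of this kind can only nominate candidates; it says nothing about whether a protein is actually processed or assembled into a flagellum.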

Evidence from mutagenesis studies in several ar- 
chaea has indicated that the flagellins are not inter- 
changeable. In M. voltae, insertional inactivation of 
the minor flagellin, flaA, resulted in cells that looked 
to be normally flagellated. However, they were not as 
motile as wild-type cells in semi-swarm plate analysis 
(44). In H. salinarum, the two flagellin gene clusters 
that encode FlgA and FlgB flagellin proteins are tran- 
scribed separately. Each gene cluster has been inser- 
tionally inactivated, and the ensuing effect on flagel- 
lation has been studied (110). The flgA mutants were 
flagellated but less motile than the wild-type cells. The 
flagella themselves were curved and spiral, similar to 
the wild type. However, the flagella of the mutant cells
were shortened and located at both the poles and sides 
of the cells. In addition, a mutant that had an insertion 
in flgA2 only had straight flagella that were located 
mainly on the cell’s poles. Mutants that had disrupted 
flgB genes were also less motile than the wild-type 
cells and had both spiral-shaped flagella at the poles of 
the cells and outgrowths from the cell surface. These 
outgrowths are believed to be membrane sacs filled 
with basal body-like structures. Some of these out- 
growths were observed to have a flagellum at their 
ends (110). From these data it was interpreted that the 
A flagellins form the main component of the filament 
and are incorporated at the initial stages of assembly. 
The less abundant B flagellins are most likely located 
near the base. The phenotype of the flgB mutants with 
outgrowths to which flagella may be attached suggests
a role for the B flagellins in proper anchoring of the 
flagella structure. In the absence of the A flagellins, it 
appears that improper incorporation of the B flagellins 
can occur, leading to truncated filaments not located 
at the poles. 


Preflagellin Peptidase Genes 


The enzyme responsible for cleavage of the signal 
peptide from the archaeal flagellins was identified as 
FlaK, the preflagellin peptidase in methanococci (8, 9) 
and as PibD (peptidase involved in biogenesis of 
prepilin-like proteins) in Sulfolobus (5). PibD pro- 
cesses several nonflagella-related proteins involved in 
sugar binding and therefore appears to have broader 
substrate specificity than the methanococcal FlaK en- 
zymes (5). FlaK is usually located distant from the fla- 
gella gene locus, although in M. jannaschii it follows 
flaJ and is likely to be cotranscribed. FlaK/PibD is a
member of the same COG (1989: www.ncbi.nlm.nih.gov/COG/)
that includes the bacterial prepilin peptidases,
such as PilD of Pseudomonas aeruginosa, even
though the amino acid sequence similarity of the bac- 
terial and archaeal enzymes is very low. An in vitro 
assay was developed that allowed for the identification 
of key amino acids in the signal peptide required for 
proper processing (5, 112). In vitro analysis revealed 
that the -1 glycine and a basic amino acid at the -2
position are extremely important for proper process- 
ing of the preflagellin. Replacement of this conserved 
glycine with alanine by site-directed mutagenesis still 
allowed processing of the archaeal preflagellin; this 
replacement also allows processing of the type IV 
prepilin in the bacterial system. The in vitro assay was 
also used to show that two conserved aspartic acid 
residues were critical for the activity of the peptidase. 
This clearly identifies the archaeal peptidase as a 
member of the aspartic acid peptidases, a family that
includes the prepilin peptidases (9). Disruption of the
flaK gene in M. voltae leads to larger-molecular-weight 
flagellins being produced, consistent with retention of 
the signal peptides. Furthermore, the mutant cells are 
not flagellated, indicating that the flagellins need to be 
N-terminally processed to be assembled into a flagel- 
lar filament (9). 


Other fla Genes 


Among the fla-associated genes, the most con- 
served are the flaHIJ clusters. The genes are present in 
the genome sequences of all archaea that possess
flagellin genes and are believed to be involved in the
export of the flagellins. Similar to flaK, flaI and flaJ have
homologs in the bacterial type IV pili system. FlaI
contains a Walker box involved in NTP binding (14), 
and the similarity of FlaI to ATPases involved in type 
IV pili extension and retraction has been recognized 
for many years (PilT/PilB in P. aeruginosa [14]; TadA 
in Actinobacillus actinomycetemcomitans [49]). Re- 
cently, Albers and Driessen (4) demonstrated that 
FlaI from Sulfolobus solfataricus possesses divalent
cation-dependent ATPase activity. Whether the
archaeal FlaI forms homohexamers similar to the
bacterial PilT (34), and as is likely for BfpF (the PilT
equivalent in the bundle-forming pilus system [26]), is still
to be determined. FlaJ is an integral membrane pro- 
tein with numerous predicted transmembrane do- 
mains, and its similarity to the conserved membrane 
PilC/TadB component of type IV pili has been re- 
ported (49, 84). Since PilC may interact with the cor- 
responding ATPase in the type IV pilus system (6), it 
may be that in the archaeal flagella system, FlaI and 
FlaJ interact. FlaH also contains a Walker Box A 
(113), but it does not have strong similarity to any 
bacterial proteins. 

In a comprehensive study of secretion ATPases in 
Sulfolobus, it was suggested that FlaI-FlaJ might form 
the minimal core of the secretion systems (4). How- 
ever, the universal association of FlaH, also a poten- 
tial ATPase (4, 113) with FlaI and FlaJ in the archaeal 
flagella operons, suggests that it too is likely to be a 
required component for the export/assembly system 
of this organelle. Some archaea contain two or more 
copies of some, or all, of the flaHIJ genes. The set
located in the immediate vicinity of the flagellin genes
may be dedicated solely to the export of the flagellins, 
while the other set may be involved in export of other 
substrates. It has been shown in both H. salinarum 
(83) and in M. voltae (114) that the flaI and flaJ genes
in the vicinity of the flagellin genes are required for
flagellation, since either inframe deletions of flaI in
H. salinarum (83) or insertional inactivation of flaJ in
M. voltae (114) results in nonflagellated cells. In these
two cases it is clear that the presence of other flaI or
flaJ homologs elsewhere on the chromosome did not
compensate for disruptions of the flaI and flaJ genes
located at the flagella locus. 

The involvement of the flaC-flaG genes in ar- 
chaeal motility is much more of a mystery. Because 
of their location between the required flagellin genes 
and flaHIJ genes and their cotranscription with these 
genes in at least some archaea (50, 113), they have 
been assumed to be involved in archaeal flagellation. 
In H. salinarum, the orientation of the flagella-associ- 
ated genes is opposite to that of neighboring flagellin 
genes, illustrating that they are not cotranscribed with 
flagellin genes in this species, although coregulation is 
still possible. While all of these genes have homologs 
in most of the flagellated Euryarchaeota, including 
members of methanococci, pyrococci, halophiles, and 
Thermoplasma species, in Crenarchaeota, no similarity,
or only weak similarity, exists to these genes. Indeed,
in most members of the Crenarchaeota, the
number of putative open reading frames (ORFs) lo- 
cated between the flagellin genes and the flaHIJ set 
simply does not leave room for all of flaC-F (76), 
clearly indicating that different subsets of the flagella- 
associated genes are present in different organisms. 
Homologs to flaC-F have not been identified in bac- 
teria, and they therefore encode archaea-specific pro- 
teins involved in flagella structure, function, or as- 
sembly. None of the genes have been reported in 
archaea that do not also have flagellin genes (76). 

Some interesting observations have been made 
about the flaC-F genes. The flaD gene has been 
shown to encode two proteins in M. voltae: one
full-length gene product of about 52 kDa and a smaller
15-kDa protein that initiates at an inframe methion- 
ine downstream of a strong ribosome binding site 
(rbs) toward the 3’ end of the gene. Both versions of 
FlaD are observed when the gene is expressed in 
Escherichia coli, and both versions are detected in 
M. voltae cells by using anti-FlaD antisera (113). The 
truncated FlaD is very similar to the full length of 
FlaE (42.7% identity over a 127-amino-acid overlap). 
A potential start codon downstream of a strong rbs at 
a position similar to that of the M. voltae flaD was 
identified in other methanococci, suggesting that 
these archaea also make two proteins from the single 
gene (113). A 16-kDa C-terminal portion of FlaD was 
detected in a proteomic study of M. jannaschii (74), 
providing proof that the translation of flaD in this hy- 
perthermophile was similar to that observed in 
M. voltae. Furthermore, western blots using anti- 
M. voltae FlaD revealed two cross-reacting bands in 
Methanococcus maripaludis membrane fractions, at 
similar molecular masses to those observed in 
M. voltae (S. Y. M. Ng and K. F. Jarrell, unpublished
data). Studies are currently underway to address the
role, if any, that this truncated FlaD has in flagellation. 

Another observation suggests that the FlaC, 
FlaD, and FlaE proteins might interact. In several ar- 
chaea, notably H. salinarum and Methanococcoides 
burtonii, there is one large gene with similarity to 
flaC, flaD, and flaE from Methanococcus spp. In 
H. salinarum, this protein is predicted to be 504 
amino acids long, and it aligns with the M. voltae 
FlaC (188 amino acids) over amino acids 30 to 233, 
and with the M. voltae FlaD (362 amino acids) over 
the remainder of its length. Because of the strong se- 
quence similarity between FlaD and FlaE, the halophile 
protein also aligns with FlaE from amino acid position 
380 to the end of the protein. If this one large protein 
is fulfilling the roles of FlaC, FlaD, and FlaE, it could 
indicate that, in those archaea that have the three
separate proteins, these proteins may interact with
each other. Using a recently developed method to cre- 
ate markerless inframe deletions in Methanococcus 
maripaludis (72), the requirement for, and possible role 
of, each of these fla-associated genes in archaeal fla- 
gellation is currently being studied (B. Chaban, S. Y. M. 
Ng, and K. F. Jarrell, unpublished data).


Glycosylation Genes 


A regular finding in many, if not all, archaeal fla- 
gellins is an aberrantly high apparent molecular mass 
determined by SDS-PAGE compared with the pre- 
dicted mass from the gene sequence (111). In fact, this 
finding is so common that posttranslational modifica- 
tion of some sort may be the rule for the flagellins 
from these organisms. In many cases, this modifica- 
tion has been shown either directly or indirectly 
through specific staining to be glycosylation. In gen- 
eral, glycans can be attached to proteins by either an 
N linkage or an O linkage (see Chapter 11). N link- 
ages involve glycan attachment to an Asn residue 
within the consensus sequence Asn-Xaa-Ser/Thr, 
where Xaa can be any amino acid except Pro. This 
sequon is regarded as necessary but not sufficient for 
N-glycosylation in eucarya (35) as well as in bacteria 
(78). On the other hand, the attachment sites for O 
linkages are less clearly defined, with the glycan co- 
valently linked to a Ser or Thr residue. Thus far, gly- 
cans found on archaeal flagellins have all been of the 
N-linkage type (104, 116). This contrasts sharply 
with glycans that have been detected on both bacter- 
ial flagella and type IV pili, which to date have all 
been reported to be O linked (109). 
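
The sequon rule stated above (Asn-Xaa-Ser/Thr, with Xaa being any residue except Pro) can be written as a short pattern search over a deduced amino acid sequence. The Python sketch below is a minimal illustration; the function name and the example sequence are made up for the purpose. Because the sequon is regarded as necessary but not sufficient for N-glycosylation, the positions returned are candidate sites only.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps the match zero-width after N, so overlapping
# sequons (e.g. N-N-S-T) are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(seq):
    """Return 1-based positions of Asn residues that lie in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Invented example: N-G-T (hit), N-P-S (excluded, Xaa is Pro),
# N-N-S and N-S-T (two overlapping hits).
example = "MNGTANPSNNSTK"
print(find_sequons(example))
```

Counting the returned positions for each flagellin gives the kind of per-protein sequon tally referred to earlier in the chapter.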

From archaeal flagellins, N-linked glycosylation 
structures have been determined in H. salinarum (104, 
118) and M. voltae (116); preliminary evidence for 
modification of flagellins has been presented for M.
jannaschii (38). The H. salinarum glycan is a sulfated 
oligosaccharide linked to Asn residues via glucose fol- 
lowed by 2 or 3 glucuronic acids, with approximately 
one-third of these replaced by iduronic acid. Further- 
more, a sulfate residue is attached to each hexuronic 
acid at position 3 and the terminal position has been 
shown to contain some variability, being occupied by 
either a hexuronic acid or a glucose residue (104, 
105). In comparison, the M. voltae glycan appears to 
be less variable. The same trisaccharide structure was 
detected a total of 14 times among the four M. voltae 
flagellins. Linked to Asn via N-acetylglucosamine, 
the structure is followed by a di-N-acetylglucuronic 
acid and finally an N-acetylated mannuronic acid 
with a threonine attached at position 6 (116). In both 
cases, the same glycan detected on the flagellins was 
also found on S layer proteins within the cells, sug- 
gesting that a common glycosylation pathway exists 
for the posttranslational modification of both types 
of proteins. 

For the first time, genes involved in the glyco- 
sylation of archaeal flagellins have been identified in 
M. voltae (21a). Using SDS-PAGE, it was found that 
mutations that altered the M. voltae glycan caused 
corresponding alterations in the apparent molecular 
mass of both the flagellins and S layer protein. Two 
genes critical for glycosylation were identified. One is 
a homolog of the oligosaccharyltransferase STT3 sub- 
unit. Homologs of this gene are found among the pgl 
genes (pglB) in bacteria such as Campylobacter, and
as part of a complex of proteins collectively known
as the oligosaccharyltransferase (OTase) in eucarya,
where they are critical in N-linked glycosylation
systems (21, 108, 109). The STT3 homologs are responsible
for the transfer of the completed glycan from a
lipid carrier to the protein target in a reaction that
occurs on the periplasmic side of the cytoplasmic
membrane in bacteria or the lumenal side of the endoplasmic
reticulum in eucarya (109). It is hypothesized that
the STT3 protein in archaea behaves similarly to its 
bacterial counterpart, being membrane bound with its 
activity on the periplasmic side of the cytoplasmic mem- 
brane. Searches of archaeal genomes reveal the pres- 
ence of STT3 homologs in most archaea, including 
nonflagellated ones. The second gene demonstrated to 
be necessary for glycosylation in M. voltae is one that 
has strong similarity to glycosyltransferases in many 
bacteria, such as Thermus and Pseudomonas. In 
M. voltae, this gene product appears to be involved 
in the transfer of the third, terminal sugar to the gly- 
can structure of the flagellins and S layer protein 
(21a). The M. voltae glycosyltransferase is also predicted
to be membrane bound (www.ch.embnet.org/software/TMPRED_form.html),
with its active site on the
cytoplasmic side of the cytoplasmic membrane. Genes
with significant similarity to this gene are also found in 
a variety of archaea, again including nonflagellated 
species. As with the STT3, this glycosyltransferase is 
likely to be involved in general glycan synthesis not 
specific to flagellins and appears to play a role in post- 
translational modification of the S layer protein and 
perhaps other cell surface proteins. 

Given the similarities between the genes involved 
in archaeal and bacterial N-glycosylation, the current 
model for glycan assembly and attachment in archaea 
is based on the bacterial model. For bacterial N- 
glycosylation, the glycan is proposed to be assembled 
on the interior side of the cytoplasmic membrane 
through the sequential addition of nucleotide- 
activated sugars onto a lipid carrier, followed by a 
“flipping” of the glycan to the periplasmic side of the 
membrane and finally transfer en bloc to the 
target protein (108, 109). At the present time, the 
same sequence of events is proposed for archaeal N- 
linked glycosylation and research is continuing to 
identify further genes involved in this pathway in 
methanococci. 


REGULATION OF FLAGELLATION 


Many archaea have been observed to possess fla- 
gella under certain growth conditions and be nonfla- 
gellated if those growth conditions vary. Growth at 
suboptimal temperature often results in nonflagel- 
lated cells: this has been observed with Methanospir- 
illum hungatei (31) and M. jannaschii (Jarrell, Ng, 
and Chaban, unpublished). However, M. maripaludis 
strain LL is more heavily flagellated at 30°C than at 
37°C in Balch HI medium (K. F. Jarrell, S. Y. M. Ng, 
and B. Chaban, unpublished observation). The regulation 
in M. hungatei appeared to be at the level of 
subunit assembly, as similar levels of flagellins were 
detected by western blotting in both flagellated and 
nonflagellated cells. 

More recently, the regulation of flagellation in 
M. jannaschii has been studied at the proteomic level. 
Significant changes in the flagellins and other proteins 
believed to be critical for flagellation were observed in 
response to decreased H2 availability, limited ammonium 
availability, and stage of growth (38, 74). In 
addition, the flagellins appear to be modified with an 
undetermined modification(s), and the protein vari- 
ants (in pI and molecular weight) vary with growth 
conditions (38). Growth under limited ammonium re- 
sulted in only minimal amounts of flagellins FlaB1 
and FlaB2. An earlier study revealed that flagellins 
FlaB2 and FlaB3, as well as flagella-associated pro- 
teins, FlaD and FlaE, were present in low amounts 
when the cells were grown in hydrogen excess conditions 
and the cells lacked flagella. Flagella synthesis 
occurred when hydrogen became limited. This repre- 
sented the first case of flagella regulation by hydrogen 
in any domain of life. This may be relevant to M. jan- 
naschii in its natural habitat of deep-sea hydrother- 
mal vents. It has been postulated that a 10-base direct 
repeat GTGTGGGGGAattGTGTGGGGGA located 
183 nucleotides downstream of the putative promoter 
and 49 nucleotides upstream of the presumed trans- 
lation start site may be a possible regulatory element 
(113). Recent transcriptome analysis of the mesophilic 
M. maripaludis grown under hydrogen-limited condi- 
tions in a chemostat indicated genes for flagella syn- 
thesis were upregulated under such conditions (E. L. 
Hendrickson, A. K. Haydock, I. Porat, W. B. Whit- 
man, and J. A. Leigh, 105th ASM General Meeting, 
abstract K-086, 2005), consistent with the observa- 
tions made on the hyperthermophilic M. jannaschii. 
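
The direct repeat postulated above as a regulatory element can be located with a simple scan. The following is a minimal sketch in Python, assuming only the 23-nucleotide sequence quoted in the text; the function name `find_direct_repeats` and its parameters `unit_len` and `max_spacer` are hypothetical and chosen for illustration.

```python
def find_direct_repeats(seq, unit_len=10, max_spacer=5):
    """Return (first_start, second_start, unit) for each exact direct repeat.

    A direct repeat here is a unit of `unit_len` bases that occurs again,
    in the same orientation, after a spacer of at most `max_spacer` bases.
    Positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - unit_len + 1):
        unit = seq[i:i + unit_len]
        for spacer in range(max_spacer + 1):
            j = i + unit_len + spacer
            # A slice that runs past the end is shorter than the unit
            # and therefore can never compare equal to it.
            if seq[j:j + unit_len] == unit:
                hits.append((i, j, unit))
    return hits


# Sequence as quoted in the text (lowercase bases form the spacer).
element = "GTGTGGGGGAattGTGTGGGGGA"
repeats = find_direct_repeats(element)
```

The scan only reports exact matches; a search of real promoter regions would probably need to tolerate mismatches, which this sketch does not attempt.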


ARCHAEAL FLAGELLA, BACTERIAL 
FLAGELLA, AND TYPE IV PILI 


Bacteria and archaea have developed many very 
different mechanisms that enable them to move in 
their environments (12). Bacterial and archaeal fla- 
gella are responsible for swimming, while type IV pili 
are involved in surface translocation or twitching. A 
comparison of archaeal flagella to bacterial flagella 
and type IV pili is presented in Table 2. In gross ap- 
pearance and performance, the archaeal flagellum re- 
sembles the bacterial flagellum but, at the genetic 
level, there appear to be no conserved genes between 
these two organelles. On the other hand, several con- 
served proteins are present in the archaeal flagellum 
and type IV pilus systems (11, 14, 84, 111). 

Table 2. Comparison of archaeal flagella to bacterial flagella and type IV pili 

Bacterial flagella 
• Usually single flagellin with a few exceptions 
• No sequence similarities between bacterial and archaeal flagellins 
• Flagellins rarely posttranslationally modified 
• If glycosylated, O-linked 
• Flagellins do not have N-terminal leader peptides 
• No archaeal homologues of genes involved in bacterial flagellation (flagellins, rod, hook, hook-associated, ring, switch, mot) 
• Rod and ring containing basal body 
• Hook region 
• Well-defined hook region length of about 60 nm 
• Filament is usually a left-hand helix 
• ~20 nm diameter 
• 2 nm channel allowing for passage of flagellin subunits 
• Swimming motility 
• Rotating filament 
• Switching of rotation 
• Swim/tumble motion 
• Growth at distal end 
• Associated chemotaxis system 

Archaeal flagella 
• Usually multiple flagellins with rare exception 
• Flagellins posttranslationally modified, usually through glycosylation 
• Glycosylation N-linked 
• Flagellins have N-terminal signal peptides 
• Cleaved by FlaK/PibD 
• No archaeal homologues of genes involved in bacterial flagellation (flagellins, rod, hook, hook-associated, ring, switch, mot) 
• Knoblike anchoring structure 
• Curved hooklike region 
• Great variability in hook length (72-265 nm) 
• Filament is a right-hand helix in H. salinarum 
• ~10-15 nm diameter 
• Channel not detected 
• Swimming motility 
• Rotating filament 
• Switching of rotation 
• Push/pull motion 
• Presumed growth at proximal end (base) 
• Associated chemotaxis system 

Type IV pili 
• Single major pilin 
• Several minor pilins 
• N-terminal sequence similarities with archaeal flagellins 
• Pilins may be glycosylated or phosphorylated depending on the species 
• Glycosylation O-linked 
• Prepilins have N-terminal signal peptides 
• Cleaved by PilD 
• FlaI is homologous to PilB, PilT, and TadA, ATPases involved in the assembly/disassembly process of type IV pili 
• FlaJ has sequence similarity with TadB and PilC, integral membrane proteins involved in type IV pili biogenesis 
• No anchoring structure observed 
• No hook 
• ~5-7 nm diameter 
• Channel too narrow for proteins to pass through 
• Surface translocation or twitching motility 
• Movement by pilus retraction/elongation at base 
• Growth at proximal end (base) 
• Often associated chemotaxis system 

The notion that the archaeal flagellum shared similarities to 
the type IV pilus began with the observation that archaeal 
flagellins and type IV pilins possess significant 
amino acid sequence similarity at their N termini 
(30). This very hydrophobic region has been demon- 
strated to be the oligomerization domain for type IV 
pilins. Several observations have strengthened this ini- 
tial hypothesis, including the lack of archaeal genes 
that have detectable homology to genes in bacteria in- 
volved in flagella structure or assembly (29, 79). Per- 
haps the most significant observation is that both 
archaeal flagellins and type IV pilins are made as pre- 
proteins with unusual, atypically short signal peptides 
processed by specific signal peptidase homologs (5, 8, 
9, 24). As the flagella gene loci of more archaea were 
released, it became obvious that besides the flagellin/ 
pilin similarity and the homologous signal peptidases, 
two other proteins were shared in the two systems: an 
ATPase and a poorly described, conserved integral 
membrane protein. 

In addition to the abovementioned similarities, 
the archaeal flagellar filament has been shown to 
share structural similarities to type IV pili, rather than 
to bacterial flagella filaments (23, 115). In bacterial 
flagella assembly, newly synthesized flagellins must 
first travel from the cytoplasm through the hollow, 
growing flagella structure via a specialized type III se- 
cretion system located at the base of the MS ring, be- 
fore incorporating under the filament cap located 
at the distal tip of the structure (62, 63). Distal 
growth would seem to be precluded in archaeal fla- 
gella if there is no central channel for subunits to pass 
through. By default, assembly must occur at the base, 
similar to type IV pili (68, 73), a mechanism that was 
suggested for archaeal flagella several years ago (45). 


PROPOSED MODEL OF ASSEMBLY 
OF ARCHAEAL FLAGELLA 


Even in very early studies of archaeal flagella, the 
unusual nature of these structures led researchers to 
speculate that the likely mechanism of assembly 
would be distinct from the bacterial flagella assem- 
bly mechanism (36). Since then, different laboratories 
have reported similarities of archaeal flagella genes to 
type IV pili genes while homologs to bacterial flagella 
genes remain undetected (14, 23, 84, 115). These ob- 
servations resulted in a proposed mechanism for as- 
sembly of archaeal flagella with its striking prediction 
of incorporation of new subunits at the base (45), as 
in type IV pili (73). The work of Trachtenberg (23, 
115), which indicates, through various electron mi- 
croscopy techniques, that archaeal flagella lack a hol- 
low channel large enough to accommodate the pas- 
sage of flagellins, appears to support a model of 
assembly at the base. Flagellin mutant studies in 
H. salinarum have been interpreted with this mechanism 
of incorporation in mind (110). However, it 
must be emphasized that direct proof of polarity of 
growth in the archaeal filaments is still lacking. 

Our current speculative model for assembly of 
archaeal flagella is shown in Fig. 5. Since the total 
number of genes identified in the archaeal flagella sys- 
tem seems very low compared with either bacterial 
flagella or type IV pili systems, it is believed that 
many more genes involved in flagellation remain to be 
identified: for instance, no anchoring or motor com- 
ponents or genes involved in gene regulation are yet 
identified. In the proposed model, preflagellins inter- 
act with chaperones in the cytoplasm, preventing ag- 
gregation through the hydrophobic N termini (87). 
The preproteins are delivered to the cytoplasmic 
membrane where at least three events must occur. The 
signal peptide is removed from the N terminus by the 
preflagellin peptidase FlaK. In most cases, the flagellin 
is modified by attachment of an N-linked glycan. This 
glycan is assembled on a lipid carrier at the cytoplas- 
mic membrane through the activity of various glyco- 
syltransferases. The completed glycan is flipped 
through the membrane and then transferred to an as- 
paragine residue present as part of an N-linked se- 
quon. The oligosaccharyltransferase, STT3 homolog, 
is responsible for this final transfer of the completed 
glycan. This N-linked system is not flagellin-specific 
and in certain archaea is shared with at least the mod- 
ification of S layer proteins. The order of these two 
posttranslational events is unknown, but the two 
processes can occur independently of each other. In 
support of this, a mutant in which flaK was disrupted 
still produced flagellins that migrated as completely 
glycosylated (9), and in mutants disrupted in the gly- 
cosylation pathway, flagellins still had their signal 
peptides removed (21a). Following signal peptide re- 
moval and N-linked glycan addition, the newly syn- 
thesized flagellins are incorporated into the filament. 
Because they are made as preproteins, unlike bacter- 
ial flagellins, we believe it likely that the new subunits 
are added from the cytoplasmic membrane to the 
base of the structure. This could be achieved through 
the combined activity of the FlaHIJ ATPase-con- 
served membrane protein complex that has type IV 
pilus homologs. Strongly supporting this unorthodox 
assembly mechanism is the observation that, unlike 
bacterial flagella, the archaeal flagella (at least in the 
case of H. salinarum) do not have a hollow channel 
for passage of subunits to the distal tip (23). How- 
ever, this assembly mechanism also raises numerous 
fundamental questions. What is the nature of the an- 
choring structure and its assembly in relationship to 
the filament, and what is the role for the subcytoplasmic 
membrane complex? 

Figure 5. Model of assembly of archaeal flagella. The assembly of the glycan modification is believed to occur independently on 
a lipid carrier in the cytoplasmic membrane (steps 1 to 3 for the three-sugar glycan of M. voltae) before transfer via an STT3 
homolog to an Asn residue within the N-linked sequon present on the flagellins (step 4). Glycosylation, as well as removal of 
the signal peptide by FlaK (steps 5 to 6, although the order of the two steps is unknown), occurs in the cytoplasmic membrane 
prior to incorporation of the flagellins into the filament (step 7). The incorporation step is presumed to involve a FlaHIJ complex 
of ATPases and a conserved membrane protein that has homologs in the type IV pili system. Incorporation of new flagellin 
subunits is hypothesized to occur at the base of the structure since no hollow center for passage of subunits has been observed. 

Is there an instigator or 
terminator of flagella growth, and how do archaeal 
flagella regenerate when they are sheared off if 
growth occurs at the base? Since many flagellated ar- 
chaea lack a murein equivalent, to what is the ar- 
chaeal flagellar stator bound? Is this part of the func- 
tion of the subcytoplasmic membrane structure? 
These and many other intriguing questions can only 
be answered by further study. 
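
The N-linked sequon mentioned in the assembly model can be searched for in a protein sequence. The sketch below is a minimal Python example, assuming the commonly used consensus Asn-X-Ser/Thr with X being any residue except proline; the chapter itself does not spell out the motif, so this pattern is an assumption. The function name `find_sequons` and the example peptide are hypothetical and do not represent a real flagellin.

```python
import re

# Assumed consensus: N, then any residue except P, then S or T.
# The lookahead leaves the following residues unconsumed, so
# overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T (X != P) motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide for illustration only.
example = "MKNASTNPSGNNTS"
positions = find_sequons(example)
```

A match only marks a candidate site; whether a given asparagine is actually glycosylated has to be established experimentally.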


CHEMOTAXIS 


Analysis of the many sequenced genomes from 
flagellated archaea indicates the presence of easily 
detected homologs of bacterial genes involved in 
chemotaxis. Indeed, the presence of a highly con- 
served, bacterial-like chemotaxis system interacting 
with the unusual archaeal flagellum is a novel feature 
of the archaeal sensory system. Nonetheless, studies 
of potential archaeal taxis (chemotaxis, aerotaxis, 
phototaxis, thermotaxis, and osmotaxis) are rare and 
often limited to a preliminary demonstration of taxis 
with no follow-up genetic work, no doubt partly due 
to a lack of genetic tools available at that time. 
Capillary assays, from Adler’s classic studies of 
bacterial chemotaxis (1), were adapted for use in 
anaerobic chambers to demonstrate chemotaxis of M. 
hungatei to acetate (70) and M. voltae to acetate, 
isoleucine, and leucine (all requirements for growth) 
but not histidine (97). Natronomonas pharaonis is 
both phototactic and chemotactic (94). H. salinarum 
is chemotactic to all 20 common amino acids (120) 
and compatible osmolytes (56). Storch et al. (103) 
performed a systematic screening of >80 chemicals 
to test for their ability to act as potential attractants 
or repellents in H. salinarum. The strongest attractants 
included several essential amino acids (leucine, isoleucine, 
valine, methionine, and arginine) as well as 
the nonessential amino acid cysteine and several pep- 
tides. No evidence for taxis towards sugars was found, 
consistent with the inability of this organism to utilize 
carbohydrates. H. salinarum is also aerotactic, at- 
tracted to orange light while repelled by UV and blue 
light (95, 122) and, most recently, has been shown to 
sense a change in membrane potential (54). 

The study of phototaxis and chemotaxis in 
H. salinarum is a rare instance where significant bio- 
chemical and genetic studies on taxis in an archaeon 
have been reported (64, 93). The purpose of various 
taxis systems (chemotaxis, phototaxis, aerotaxis, etc.) 
in bacteria and archaea is to allow the organisms to 
swim away from harmful conditions and/or toward 
ones favorable for growth and survival. For H. sali- 
narum, maximum growth rate occurs under aerobic 
conditions chemoheterotrophically (41). H. salinarum 
has the ability to use light as an energy source, in ad- 
dition to using aerobic respiration and arginine fer- 
mentation (54). Bacteriorhodopsin (BR) and halo- 
rhodopsin (HR) pump out protons or import chloride 
ions, respectively, setting up an electrochemical gradi- 
ent across the cytoplasmic membrane. Two sensory 
rhodopsins (SRI and SRII) affect swimming migration 
of the halophiles by influencing the flagella motor ro- 
tation in response to various wavelengths of light 
(82). SRI absorbs at ~600 nm and triggers attraction 
to green/orange light but avoidance to harmful UV 
light, while SRII (absorbing light at ~500 nm) results 
in avoidance of blue light (40, 71). The dual role of 
SRI means that cells migrate toward green/orange 
light where the ion pumps will be fully activated but 
not into full spectrum solar radiation containing 
damaging short wavelength photons (41, 101). 

When oxygen and nutrients available for res- 
piration are plentiful, SRI, BR, and HR synthesis in 
H. salinarum are inhibited, but induction of SRII oc- 
curs. The resulting migration of the cells away from 
high luminescence toward the dark mediated by SRII 
presumably occurs to prevent light-induced damage 
to cells. Under conditions of low oxygen and limited 
nutrients, induction of BR, HR, and SRI all occur, 
with the synthesis of SRI resulting in migration of the 
cells toward orange light where BR and HR act to es- 
tablish a proton motive force to drive synthesis of 
ATP (82). 

The presence of a dedicated chemotaxis system 
to detect trimethylammonium compounds (used as 
osmolytes) in H. salinarum may provide an important 
selective advantage in its high-salt environment. How- 
ever, when cells were grown in the presence of such 
compounds, growth and survival were not greatly af- 
fected (56). 


CHEMOTAXIS GENES 


In recent years, it has become obvious that the 
Archaea possess a chemotaxis system that is similar 
to that of the Bacteria. In most flagellated archaea, analysis 
of completely sequenced genomes reveals the presence 
of a complete set of chemotaxis genes homologous to 
those of bacteria (Table 3) (107). These include pro- 
teins involved in all processes of bacterial chemotaxis, 
namely signal recognition and transduction (various 
methyl-accepting chemotaxis proteins [MCPs] and 
CheD), excitation (CheA, CheW, and CheY), adaptation 
(CheR and CheB), and signal removal (i.e., CheY-P 
dephosphorylation, CheC). Some che genes found only 
in a subset of bacteria (cheV, cheX, and cheZ) have not 
been reported in archaeal genome annotations. To date, 
all che gene clusters have been found within organisms 
belonging to the Euryarchaeota, with no che genes or 
MCPs detected within members of the Crenarchaeota, 
despite the availability of complete genome sequences 
for several flagellated members of the Crenarchaeota. 
Similar to their subcellular localization in many bac- 
teria, MCPs are physically located at the poles in H. 
salinarum, suggesting that this feature has been con- 
served in evolution (37). Since the location of MCPs 
in bacteria may play a role in regulation of chemo- 
taxis, especially in regard to the role of MCP cluster- 
ing in regulating ligand binding and signaling, the 
similar localization of MCPs in at least one archaeon 
could indicate a similar regulation of chemotaxis in 
all bacteria and archaea. 
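
The grouping of chemotaxis proteins by process given above can be turned into a simple completeness check for an annotated genome. This is a minimal sketch in Python; the mapping follows the grouping in the text, while the function name `missing_che_genes`, the lowercase `mcp` label, and the example gene list are hypothetical and used only for illustration.

```python
# Process-to-gene grouping as described in the text.
CHE_PROCESSES = {
    "signal recognition and transduction": {"mcp", "cheD"},
    "excitation": {"cheA", "cheW", "cheY"},
    "adaptation": {"cheR", "cheB"},
    "signal removal": {"cheC"},
}


def missing_che_genes(annotated):
    """Map each incomplete process to the sorted list of its absent genes."""
    present = set(annotated)
    return {
        process: sorted(genes - present)
        for process, genes in CHE_PROCESSES.items()
        if genes - present
    }


# Hypothetical annotation list, not taken from any real genome.
genome = ["cheA", "cheB", "cheC", "cheD", "cheW", "cheY", "mcp"]
gaps = missing_che_genes(genome)
```

An empty result means every gene in the grouping was found; it says nothing about whether the genes are clustered or cotranscribed.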

Unlike the fla operon, che clusters in archaea are 
much more variable in their composition and gene or- 
der. Genes within che clusters are often inverted, pre- 
cluding cotranscription in many cases. With the excep- 
tion of M. maripaludis, cheY, cheB, and cheA tend to 
be clustered together in that order (Fig. 6). Surrounding 
these genes are cheC, cheD, cheR, cheW, and at least 
one MCP. Tandem copies of cheC are observed, and in 
many cases small genes of unknown function are found 
within the cluster. In addition, clusters from Halobac- 
terium and Methanosarcina include several flagellin 
genes. The Methanosarcina species M. acetivorans and 
M. mazei have two distinct, complete che gene clusters 
(designated Cluster 1 and 2). Many other species con- 
tain additional copies of cheA, cheY, and/or cheC and 
rarely cheR or cheW. These genes are located outside 
the che cluster elsewhere on the chromosome. It has 
not been determined whether the additional che genes 
serve functions related to chemotaxis or are involved in 
other processes. Strangely, P. furiosus lacks detectable 
che genes while the genomes of the two other Pyrococ- 
cus species contain a complete set. It is also unusual 
that M. jannaschii lacks che genes, since a complete set 
is found in the mesophilic relative, M. maripaludis. 
Table 3. Distribution of che gene clusters in archaea^a 

Columns: Organism, Flagellin, che genes (CheA, CheB, CheC, CheD, CheR, CheW, CheY), MCP 

Organisms 
Crenarchaeota: Aeropyrum pernix; Pyrobaculum aerophilum; Sulfolobus acidocaldarius; Sulfolobus solfataricus; Sulfolobus tokodaii 
Euryarchaeota: Archaeoglobus fulgidus; Ferroplasma acidarmanus; Haloarcula marismortui; Halobacterium salinarum; Methanocaldococcus jannaschii; Methanococcoides burtonii; Methanococcus maripaludis; Methanopyrus kandleri 

GI numbers by column 
CheA: 11498645; 55378887, 55378897, 55379264; 15790089; 68211075; 45358490 
CheB: 11498646; 55378896; 15790090; 68211076; 45358489 
CheC: 11498644; 55378056, 55377404; 15790087, 15790088, 15790567; 68211074; 45358495, 45358494 
CheD: 11498643; 55378886; 15790086; 68185296; 45358491 
CheR: 11498642; 55378898; 15790085; 68079876; 45358493 
CheW: 11498649; 55378265, 55378895; 15790067, 15790092; 68185302; 45358488 
CheY and MCP: 
11498647 11498650, 
11498639 55378888 55377368 
55379092, 55377553, 55377801, 55377770, 55379379, 55378800, 55376954, 55378507, 55376654, 55377425, 55379831, 55379155, 55379349, 55377103, 55379274, 55377296, 55379260, 55377270, 55377812 
15790682, 15790681, 15790508, 15790077, 15790413, 15789966, 19789953, 15789618, 15790664, 15790685, 15790497, 15789962, 15790756, 15789818, 15790609, 15790124, 15790447 
15790091 
68185300 68185303, 68184724, 68184796 
45358492, 45357976, 45358050, 
45358351 
45358496 45358867 

Table 3. Continued 

Organism  Flagellin  CheA  CheB  CheC  CheD  CheR  CheW  CheY  MCP 
Methanosarcina 3 20088913 20088914 200888911 20088910 20088912 20088919 20091886 20088918, 
acetivorans 20091884 20091885 20091883 20091882 20091881 20091888 20088915 20091889 
20090837 20091276 
20092349 

Methanosarcina barkeri 1 68134062 68079874 68134061 68079877 68079876 68079923 68079873 68079922, 
68079486 68079922, 
68081123 
Methanosarcina mazei 3 21226430 21226431 21226429 21226428 21226427 21226434 21226432 21226435, 
21227427 21227428 21227425 21227424 21227426 21227432 21227429 21227431, 
21227760 

Methanothermobacter 0 - - - - - - - 

thermautotrophicus 

Nanoarchaeum equitans 0 - - - - - - - 

Picrophilus torridus 0 - - - - - - - - 
Pyrococcus abyssi 3 14521750 14521753 14521750 14521749 14521755 14521757 14521754 14520639, 
14521752 14521751 14521748, 
33356802 14521756, 
14521788, 
14520639 
Pyrococcus furiosus 2 - - - - - - 18976805 
Pyrococcus horikoshii > 14590396 14590395 14590398 14590399 14590393 14590390 14590394 14590358, 
14590397 14590391, 
14591598, 
14590400, 
14591707 
Thermococcus 5 57640570 57640568 57640571 57640574 57640566 57640564 57640567 57640091, 
kodakaraensis 57640569 57640572 57640573, 
57158895 57642082, 
57158896 57640565, 


Thermoplasma acidophilum 2 - - 
Thermoplasma volcanium 2 - - 


15790447 


^a GI numbers refer to proteins from genome sequences at NCBI Genomes (www.ncbi.nlm.nih.gov/genomes/static/a.html). 


To date, the only direct genetic study of archaeal 
che genes was undertaken by Rudolph and Oesterhelt 
(90) in H. salinarum, an archaeon that possesses a che 
operon with greatest similarity to B. subtilis (54). 
This investigation involved the deletion of the H. salinarum 
cheA, cheB, cheY, and cheJ (a gene with significant 
similarity to cheC) genes. All four genes were 
inactivated and then complemented in different combinations 
of the three genes. The loss of cheA, cheB, 
or cheY led to the complete loss of chemotaxis and 
phototaxis, whereas the absence of cheJ resulted in a 
reduction of chemotactic and phototactic ability. The 
genes cheA and cheY were found to be required for 
reverse swimming and counterclockwise rotation of 
the flagella. cheB mutants displayed wild-type distributions 
of forward and reverse swimming (50:50), 
while cheJ mutants were skewed to an 88:12 distribution 
of forward:reverse swimming. Although the exact 
role of cheJ remains unknown, it was suggested to 
play a role in halobacterial signal transduction. 


ARCHAEAL TRANSDUCERS 


The detection of various external signals is ac- 
complished through a variety of receptor proteins. The 
majority of research in this area has been performed 
in extreme halophiles. In H. salinarum, 18 transducer 
genes have been identified in the sequenced genome 
(54, 77). Archaeal transducers belong to three distinct 
families (Fig. 7). Family A consists of bacterial-type 
chemotaxis transducers with periplasmic and cyto- 
plasmic domains interconnected by two transmem- 
brane segments. Family B consists of transducers with 
two or more transmembrane segments and no periplasmic 
domain, such as the SRI transducer HtrI. In 
family C, transducers are soluble cytoplasmic pro- 
teins. Transducer proteins are responsible for the de- 
tection of the external signal that is transmitted to the 
internal components of the chemotaxis system. Light 
receptors such as SRI and SRII interact with cognate 
transducer proteins, HtrI and HtrII, respectively. 

Figure 6. Chemotaxis gene families in selected archaeal species. Similar shadings indicate homologs shared among families. 
Genes are transcribed in the direction of the arrows. 

In H. salinarum, HtrII also acts as a chemotransducer 
for serine via its large periplasmic domain (42) while 
the equivalent protein in N. pharaonis, which lacks 
the periplasmic ligand-binding domain, has no known 
role in chemotaxis (71). In other cases, chemicals can 
be detected through binding proteins such as BasB 
and CosB, which interact with their cognate trans- 
ducer proteins BasT and CosT (55, 56). Other recep- 
tors, such as HemAT (for aerotaxis) (43), and proba- 
bly Car (arginine taxis) (103) and MpcT (for detection 
of changes in membrane potential) (54), also act as 
transducers. Many of these transducers have unusual 
features. For example, Car and HemAT are cytoplas- 
mic transducers while BasT and CosT are both asso- 
ciated with membrane-attached binding proteins 
(BasB and CosB) that have a role in chemotaxis but 
not in transport. The binding proteins have features 
characteristic of lipoproteins, including the LIPO box 
at the N terminus (56). MpcT is the first transducer 
identified that responds to a change in the membrane 


potential, and HemAT is the first myoglobin-like 
heme-containing protein reported in the Archaea. 
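
The three transducer families described above differ in two structural features, which makes the assignment easy to express as a rule. The following minimal Python sketch assumes that the number of transmembrane segments and the presence of a periplasmic domain are already known for a protein; the function name `transducer_family` and its parameters are hypothetical. The three example entries use only features stated in the text.

```python
def transducer_family(tm_segments, has_periplasmic_domain):
    """Assign family A, B, or C from two structural features, else None."""
    if tm_segments == 0 and not has_periplasmic_domain:
        return "C"  # soluble cytoplasmic transducer
    if tm_segments == 2 and has_periplasmic_domain:
        return "A"  # bacterial type: periplasmic domain, two TM segments
    if tm_segments >= 2 and not has_periplasmic_domain:
        return "B"  # membrane bound, no periplasmic domain
    return None  # combination not covered by the three families


# (transmembrane segments, periplasmic domain) as described in the text.
examples = {
    "HtrI": (2, False),
    "HtrII": (2, True),   # H. salinarum protein
    "Car": (0, False),
}
families = {name: transducer_family(*feat) for name, feat in examples.items()}
```

Returning `None` for unlisted combinations is a design choice of this sketch, so that an unexpected architecture is flagged rather than forced into a family.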

Co-localization of receptor and transducer genes 
on the chromosome may be a general principle in ar- 
chaea (56). This is observed with the basB/T and 
cosB/T systems where the substrate-binding and trans- 
ducer genes are organized in operon-like structures 
in the genome. Furthermore, the genes for SRI and 
SRII are apparently transcribed with their respective 
transducer partners, HtrI and HtrII. 


Phototaxis Transducers 


In H. salinarum, the photoreceptors responsible 
for phototactic behavior are SRI and SRII. SRII is 
well characterized for its role in H. salinarum’s repellent 
response to blue light (66). SRI, on the other 
hand, mediates both an attractant response to orange 
light and a repellent response to UV light. 
Figure 7. Archaeal transducer families involved in taxis. Similar to bacterial MCPs, the cytoplasmic domain contains conserved 
modules involved in methylation/demethylation for signal adaptation as well as for signal transmission to the histidine kinase, 
CheA. Family A transducers are similar to bacterial chemotaxis transducers while Family B transducers lack the periplasmic 
domain. Family C transducers are soluble. Aerotaxis transducer HtrVIII is unusual in having six transmembrane helices 
rather than two. Depiction of HtrII is from H. salinarum. N. pharaonis HtrII lacks the periplasmic domain. Figure based on 
data presented in Hoff et al. (41) and Marwan and Oesterhelt (64). 


The halobacterial transducers for phototaxis are 
homologous in sequence, structure, and function to 
the bacterial methyl-accepting chemotaxis proteins. 
However, they lack an extracellular ligand-binding 
domain and are physically attached to the rhodopsins. 
Similar to bacterial transducers (methyl-accepting 
chemotaxis proteins), HtrI contains two transmem- 
brane helices, a strongly conserved cytoplasmic re- 
gion involved in binding of a histidine kinase, and 
flanking regions containing carboxylmethylation 
sites. Crystallization of the receptor-transducer com- 
plex has been achieved using a shortened transducer 
(residues 1 to 114, N. pharaonis HtrII) comprising 
the two transmembrane helices (TM1 and TM2) and 
an additional small cytoplasmic fragment (39). One 
intriguing aspect of sensory rhodopsin is that, in its 
transducer-free state, it acts as a light-driven proton 
pump. Interaction with transducer interrupts the 
transport cycle through closure of the cytoplasmic 
half-channel. Through the generation of a series of 
mutants truncated in the membrane-proximal region 
of an SRI-HtrI fusion protein, it was demonstrated 
that residues 62 to 66 were nearly completely responsible 
for channel closure by converting transducer-free 
rhodopsin from a proton pump into a signal-relaying 
device (22). These findings were consistent 
with fluorescent probe-labeling and cysteine cross- 
linking studies, which revealed that the five residues 
were within energy-transferring distances (119). Site- 
directed mutagenesis of these five residues (K62A, 
E63A, I64A, A65S, and A66S) demonstrated that 
channel closure requires the presence of amino acids 
at these five positions, but surprisingly, closure does 
not depend on a specific amino acid type (22). 


Chemotaxis Transducers 


Several chemotactic transducers were identified 
in H. salinarum by Western-blot analysis with antisera 
to bacterial transducers (2). Using a probe specific to 
a conserved region in transducers, 13 putative trans- 
ducers were subsequently identified in H. salinarum 
(122). With the completion of the H. salinarum ge- 
nome sequence, as many as 18 putative transducers 
were identified (54, 77). They share highly conserved 
C-terminal regions that are likely to be involved in 
signaling to the histidine kinase and methylation- 
dependent adaptation. Although different transducers 
have been characterized to varying degrees (see be- 
low), many others remain uncharacterized. 

The phototransducer HtrII of H. salinarum is a 
member of transducer family A. One fundamental difference 
between the phototransducers HtrI and HtrII 
is that HtrII is more like its bacterial counterpart and 
contains a large periplasmic domain at the N termi- 
nus. Subsequent functional analysis using HtrII dele- 
tion and overexpression strains confirmed the hypoth- 
esis that, in addition to being a phototransducer, HtrII 
also functions as a chemotransducer (42). Deletion 
mutants are nonchemotactic toward serine and are de- 
void of methyltransferase activities following serine 
chemostimulation, while responses to aspartic acid 
and glutamic acid remain comparable to wild type. 

Several potential halobacterial transducer genes 
were inactivated in H. salinarum, and the mutants 
were tested against a wide range of compounds. This 
analysis led to the identification of Car, a specific 
transducer for arginine (103). Car is rare in that it is a 
soluble, cytoplasmic transducer (rather than a cyto- 
plasmic membrane-located transducer) that is able to 
be methylated. Its function is to sense the arginine 
level in response to arginine:ornithine antiporter ac- 
tivity, through either direct binding or interaction 
with an unknown, cytoplasmic receptor component. 
Another halobacterial cytoplasmic transducer is 
HtrXI from halobacterial strain Flx15 (20). HtrXI 
differs from Car at only 9 amino acid residues. An 
htrXI deletion mutant is defective in chemotaxis to-
ward glutamic acid and aspartic acid and devoid of
methyltransferase activity. It is not known whether 
HtrXI interacts with a membrane protein or directly 
binds to its intracellular chemoeffector. 

Analysis of the H. salinarum genome led to the 
identification of binding protein orthologs immedi- 
ately upstream of potential transducers (BasB/BasT 
and CosB/CosT pairs) (56). These binding proteins 
are membrane bound and are presumably attached to 
the archaeal membrane through lipid anchors. No 
promoter-like elements were identifiable upstream of 
basT and cosT; Northern blot analysis confirmed that 
basB/basT belong to a single transcriptional unit. 
Functional analysis of deletion mutants revealed that 
BasB/BasT mediates chemotaxis toward branched-chain
and sulfur-containing amino acids (leucine, valine,
isoleucine, methionine, and cysteine), while CosB/CosT
mediates chemotaxis toward compatible solutes. Cells
carrying a truncated BasT transducer were found to 
be nonchemotactic toward BasT-specific solutes, 
thereby demonstrating a function for the extracellular 
signaling domain. The homologous binding proteins 
in bacteria serve the dual function of mediating 
chemotactic responses and initiating solute uptake 
through interaction with the ABC transport system. 
However, since the halobacterial genome does not en-
code an equivalent ABC transporter system
for branched-chain amino acids, in H. salinarum both
BasB and CosB have roles exclusively limited to
chemotaxis and not to actual transport of substrates
(56). It has been speculated that the chromosomal co- 
localization of the substrate-binding and transducer 
genes may be due to a need for precise regulation of 
the two modules (56). 


Aerotaxis Transducers 


The H. salinarum aerotactic response was ini- 
tially observed as motile cells accumulating around 
an air bubble (16, 102). The first aerotaxis transducer 
was characterized by Brooun and colleagues (19). 
The htrVIII gene was originally cloned as one of 
13 putative transducers with homology to methyl-ac- 
cepting proteins (122). HtrVIII was speculated to 
have a role in oxygen sensing, based on its six trans- 
membrane segments being homologous to the heme- 
binding sites of the eucaryal cytochrome c oxidase 
(19). The HtrVIII C-terminal domain shares homol- 
ogy with the bacterial methyl-accepting chemotaxis 
protein. Aerotaxis was demonstrated through mi-
croslide capillary assays; an htrVIII deletion strain
lost aerotactic response while an overexpression 
strain exhibited enhanced aerotaxis. The aerotactic 
behavior was found to be methylation dependent. 

HemAT-Hs (previously HtrX) is another newly 
identified myoglobin-like, heme-containing aerotactic 
transducer in H. salinarum. It possesses N-terminal 
homology to myoglobin, as well as similarity to the 
cytoplasmic signaling domain of Tsr (E. coli MCP). 
Analysis of deletion and overexpression strains con- 
firmed that HemAT-Hs is responsible for the aero- 
phobic response in H. salinarum (43). The signaling is 
a methylation-dependent process, elicited by the two 
methylation sites at the K1 region of the C terminus. 
HemAT-Hs is a soluble transducer that appears to 
function by binding diatomic oxygen at its heme 
when the heme is in the ferrous state, with the oxygen 
binding triggering a conformational change in the 
N-terminal sensor domain that alters the C-terminal 
signaling domain. The similarity of the C-terminal 
signaling domain to the MCP family of bacterial 
chemoreceptors implies that the subsequent signal 
transduction events leading to aerotaxis in H. sali- 
narum are likely to be similar to those involved in 
chemotaxis in bacteria. 


Transducer Responding to Membrane Potential 


Recently, MpcT (formerly Htr14) was identified 
in H. salinarum as the transducer responsible for 
communicating changes in the membrane potential to 
the taxis system (54). In H. salinarum mutants con-
taining only one of the light-driven ion pumps (bac-
teriorhodopsin or halorhodopsin) as their sole retinal
protein, a decrease in irradiance causes a phototactic 
response in the absence of respiration. The decrease 
in the proton motive force (PMF) is detected, and the 
cell responds with a reversal of flagellum rotation. By 
systematically knocking out each of the 18 Htr- 
encoding genes, htr14 was identified as the putative 
PMF sensor and was the only Htr gene essential for 
the photoresponse in these cells. Responsiveness 
was restored by complementation with htr14. Based 
on calculations of the cytoplasmic buffering capac- 
ity, it was determined that MpcT was responding to 
changes in the membrane potential component of the 
PMF and not the change in internal pH. 
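
The two components of the PMF distinguished above can be illustrated with the standard chemiosmotic relation, PMF = ΔΨ - (2.303RT/F) x ΔpH, where ΔpH is taken here as internal minus external pH. The Python sketch below is only an illustration of that relation: the function name and all numerical values are assumptions chosen for the example, not measurements from H. salinarum.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def pmf_components_mV(delta_psi_mV, ph_in, ph_out, temp_K=298.15):
    """Split the proton motive force into its two components (all in mV).

    Convention assumed here: delta_psi is inside minus outside, and
    delta_pH is pH_in minus pH_out, so PMF = delta_psi - Z * delta_pH.
    """
    z_mV = 1000.0 * math.log(10) * R * temp_K / F  # about 59 mV per pH unit at 25 C
    chemical_mV = -z_mV * (ph_in - ph_out)
    return delta_psi_mV, chemical_mV, delta_psi_mV + chemical_mV


if __name__ == "__main__":
    # Hypothetical values: -120 mV membrane potential, cytoplasm 0.5 pH units
    # more alkaline than the medium.
    electrical, chemical, total = pmf_components_mV(-120.0, 7.5, 7.0)
    print(f"membrane potential term: {electrical:.1f} mV")
    print(f"pH gradient term:        {chemical:.1f} mV")
    print(f"total PMF:               {total:.1f} mV")
```

In a well-buffered cytoplasm the pH term changes little on a short time scale, so under these assumptions a rapid change in PMF would show up mainly in the membrane potential term.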


GENERAL MECHANISM 
OF TAXIS IN ARCHAEA 


Extreme halophiles remain the only archaea in 
which significant chemotaxis/phototaxis studies have 
been reported. Similar to bacteria, in the absence of 
a gradient of stimuli, halobacteria exhibit a random 
walk as cells intermittently change the direction of 
their swimming (81). The presence of a gradient of at- 
tractant results in suppression of spontaneous motor 
switching, and as a result extends swimming, while 
increasing the level of repellents leads to activated 
switching and a resultant change in direction. Work 
by Oesterhelt’s group on H. salinarum established that
archaea possess a chemotaxis system analogous to that
of bacteria (90-92). Moreover, the identification and
function of various transducers have been reported 
(19, 43, 54, 56, 71, 103). It appears that once the 
stimulus has been detected, a bacterial-like response 
is triggered (Fig. 8). As in bacteria, the heart of the 
chemotaxis mechanism in archaea lies in a two-com- 
ponent regulatory system in which the ligand occu- 
pancy state of the receptors controls autophosphory- 
lation of the histidine kinase, CheA. CheA is inhibited 
by attractants in most bacteria, such as E. coli. B. sub- 
tilis is an exception where attractants stimulate CheA 
activity. Similar to most bacteria, chemical and light at- 
tractants decrease CheA activity in archaea (107). 
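
The behavioral logic described above (random switching in the absence of a gradient, suppressed switching when swimming up an attractant gradient) can be sketched as a one-dimensional biased random walk. The following Python example is a toy model under stated assumptions: the switching probabilities, step counts, and function names are hypothetical and are not taken from measurements on halobacteria.

```python
import random


def simulate_cell(steps, p_switch_up, p_switch_down, rng):
    """Return the final 1-D position of one cell; +1 points up the gradient."""
    position = 0
    direction = rng.choice((-1, 1))
    for _ in range(steps):
        position += direction
        # Switching is suppressed while the cell swims up the attractant gradient.
        p_switch = p_switch_up if direction == 1 else p_switch_down
        if rng.random() < p_switch:
            direction = -direction
    return position


def mean_displacement(n_cells, steps, p_switch_up, p_switch_down, seed=0):
    """Average final position over a population of independent cells."""
    rng = random.Random(seed)
    total = sum(
        simulate_cell(steps, p_switch_up, p_switch_down, rng)
        for _ in range(n_cells)
    )
    return total / n_cells


if __name__ == "__main__":
    # No gradient: same switching probability in both directions.
    print("no gradient:", mean_displacement(200, 2000, 0.10, 0.10))
    # Attractant gradient: switching suppressed when moving up the gradient.
    print("attractant: ", mean_displacement(200, 2000, 0.02, 0.10))
```

With equal switching probabilities the population shows no net drift, whereas lowering the switching probability in the favorable direction produces net movement up the gradient, even though individual cells never sense direction as such.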

Because many chemoreceptors remain unidenti- 
fied, the first step in transmembrane signaling is not 
very well characterized. Limited information has come 
from studying the N. pharaonis SRII transducer com-
plex. The four retinal proteins, BR, HR, SRI, and SRII,
share a common topology with membrane helices 
arranged in two arcs (80). The inner arc is composed 
of helices B, C, and D, while the outer arc comprises 
helices A, E, F, and G. A transmembrane pore is
formed mainly between helices B, C, F, and G, and the
retinal is attached to a lysine residue in helix G. Based
on the crystal structure of the N. pharaonis receptor-
transducer complex, a working model has
been proposed in which light excitation triggers an
outward movement of the cytoplasmic part of helix
F, which in turn induces a rotation of one of the trans-
membrane helices of the transducer, thereby switching
it into the active state (39). It has also been suggested that,
because SRII-HtrII of H. salinarum displays dual function-
ality (in both phototaxis and chemotaxis; see “Photo-
taxis Transducers” and “Chemotaxis Transducers,”
above), the transmembrane-signaling mechanism in-
volved in both types of taxis is also likely to be shared.
Interaction of the MCP-like transducers initially 
occurs with CheW, and the signal is passed to CheA. 
Following autophosphorylation, CheA transfers the 
phosphate to CheY. Interaction of the phosphorylated 
CheY (CheY-P) with a switch component of the ar- 
chaeal flagellum is assumed to then occur, resulting in 
a change in the rotation of the flagellum. In the bac- 
terial system, CheY binds to the switch component, 
FliM (18), but a homolog or functionally equivalent 
protein has not been identified in any archaea (29, 
79). Hence, the necessary connection of the sensing 
system to the motility system remains an enigma. For 
phototaxis in H. salinarum at least and presumably 
for other sensory systems in addition to the two- 
component signaling system, there is also signaling 
through fumarate. Both CheY and fumarate are re- 
quired for switching of flagella rotation in phototaxis; 
a similar role for fumarate as a switch factor was later 
shown in E. coli. Activated transducer causes the re- 
lease of fumarate from a fumarate-binding protein lo- 
cated in the cytoplasmic membrane (64, 65, 67). As 
many as 60,000 molecules of fumarate may be re- 
leased per second in a cell responding to light stimu- 
lus. Evidence suggests that fumarate and CheY act co- 
operatively at the level of the flagellar switch (64). 
Models for the mechanism of various taxes must be 
able to integrate not only data on detection of exter- 
nal stimuli but also more recent data that clearly 
show cells engaging in fumarate-mediated signaling 
as well as sensing changes in membrane potential and 
intracellular arginine levels. These observations 
clearly suggest that, in addition to changing external 
stimuli, the metabolic state of a bacterial or archaeal 
cell is important in regulating tactic behavior (64). 

Figure 8. (See the separate color insert for the color version of this illustration.) Overview of halobacterial signal transduc-
tion. Transducer proteins (Htr proteins) are depicted as dimers (brown) and shown in their expected topology. The Htr re-
gions involved in adaptation (yellow) and in signal relay (dark gray) to the flagellar motor via various Che proteins are indi-
cated. The actions of the Che-protein machinery are illustrated for only one of the Htr proteins, shown on the left, for which
an interaction with a substrate-loaded, membrane-anchored binding protein is indicated. CheD and CheJ (CheC) proteins are
omitted for clarity. Htr1 and Htr2 transduce light signals via direct interaction with their corresponding receptors SRI and SRII.
Repellent light signals mediated by SRI and SRII elicit the release of switch factor fumarate from a membrane-bound fu-
marate pool. MpcT senses changes in membrane potential (ΔΨ) generated via light-dependent changes in ion transport activ-
ity of BR and HR. The relative sizes of receptors, binding proteins, transducers, and Che proteins approximately reflect their
corresponding molecular masses. Reproduced from Molecular Microbiology (54) with permission of the publisher and
D. Oesterhelt.

For the cells to detect changes in the signal in-
tensity over a wide range of signal strength (light in-
tensity or chemical concentration), flagellated archaea
are equipped with an adaptation mechanism. Similar
to bacterial MCPs, archaeal transducers contain glu-
tamate residues in conserved regions of their cyto-
plasmic domains that can be methylated. These res-
idues are substrates for the methylesterase CheB and
the methyltransferase CheR. In the HtrI and HtrII
sequences in H. salinarum, Perazzona and Spudich 
(85) identified three glutamate pairs containing po- 
tential methylation sites. Mutagenic substitution of 
these glutamate pairs by alanine subsequently identi- 
fied Glu265-Glu266 (HtrI) and Glu513-Glu514 (HtrII) 
as being responsible for signal transduction. Further- 
more, it was revealed that H. salinarum adaptation to 
chemostimuli is different from the E. coli chemotaxis 
paradigm but is instead similar to the B. subtilis case, 
in which turnover rates increase regardless of the na- 
ture of the stimulus. The methylation state of the Htr 
proteins results in adaptation to signals, or a resetting 
of the system so that cells can continue to respond 
to changes in the signal rather than absolute signal 
strength. Thus it appears that the same mode of adap- 
tation to stimulus occurs in archaea as in bacteria. 
CheY-P determines the direction of flagellar rota-
tion: in E. coli it causes CW rotation and in B. subtilis
CCW rotation. When cheY or cheA was deleted in
H. salinarum, no reversals in swimming were seen 
and cells showed a preferential forward swimming 
due to CW flagella rotation (90). Thus CheY-P is re- 
sponsible for the reversal of rotation, i.e., CCW. The 
default direction of flagella rotation in H. salinarum 
is CW, and CheY-P is required to cause rotation in 
the CCW direction (107). Since quick responses to 
changing signals in the environment are a necessary 
part of the chemotaxis system, the half-life of the re-
sponse regulator, CheY-P, is usually very short (<1
min for various bacterial CheY-P); dephosphorylation
is often aided by specific phosphatases. While the
best-studied phosphatase is CheZ in E. coli, most
chemotactic bacteria and archaea lack the gene for
this protein (117). In B. subtilis, CheY-P dephospho-
rylation is carried out by a combination of FliY,
CheC, and possibly CheX (106). FliY is particularly
interesting as it is a component of the flagellar switch
in Bacillus. In archaea, the candidate for CheY-P de-
phosphorylation appears to be CheC, which is pre-
sent in all chemotactic archaea (107). Since many ar-
chaea appear to contain multiple copies of cheY-like
genes, it may be that some archaea use alternate ver- 
sions of CheY to act as phosphate sinks in signal re- 
moval from CheY-P, as is the case in Sinorhizobium 
meliloti (98). 


PERSPECTIVE: THE NEXT FIVE YEARS 


The study of flagella, type IV pili, and chemo- 
taxis in bacteria has generated a wealth of knowledge 
on a variety of fundamental concepts, including gene 
regulation, protein export, and protein-protein inter-
actions, that reaches far beyond the limited direct appli-
cation to bacterial and archaeal motility. Continued
study of archaeal flagellation and chemotaxis is ex- 
pected to yield the same far-ranging information 
about the much less well-studied archaea. In the next 
five years, it is predicted that significant advances 
will be made in answering fundamental questions about
how the archaeal flagellum is assembled, its compo-
nent proteins, and its interaction with the chemo-
taxis system.

New genes are slowly being identified that are in- 
volved with flagellin posttranslational modification, 
i.e., the processing peptidase and glycosylation genes. 
The latter also play a role in modification of the 
S layer. Genes involved in posttranslational modifi- 
cation may be unique to each organism, and many of 
these are yet to be identified. Structural studies of the 
glycans are a necessary prerequisite for seeking out 
the required genes and identifying the precise roles of 
the gene products in glycan composition, assembly, 
and attachment. However, additional flagella struc- 
tural genes still need to be identified. These must in- 
clude genes for anchoring proteins (basal body equiv- 
alents), mot genes (needed for flagella motor function, 
i.e., rotation of a completed flagellum) and switch 
genes (needed for torque generation and reversal of 
flagella rotation). None of these have yet been identi- 
fied by homology searches based on their bacterial 
counterparts and so other methods must be used. 
These may include random insertional mutagenesis
using cloned genomic fragments or,
possibly, transposon mutagenesis. These studies will
be facilitated by the methodological development of 
systems for genetic manipulation of archaea, particu- 
larly for methanogens (53, 72, 61, 121; see Chapter 
13). Significant progress in understanding motility, 
flagellation, and chemotaxis will occur as the genetic 
tools continue to improve in the various model or-
ganisms. Study of the flagella-associated genes, which
are often cotranscribed with the flagellin genes, should
yield important information about their so far com-
pletely unknown roles in archaeal flagellation.

Another issue that, it is hoped, will be resolved in 
the next 5 years is proof of the polarity of growth of 
archaeal flagella. While similarities of archaeal flagella
to type IV pili have been described, these are limited 
to the flagellins, associated peptidase, ATPase, and 
one conserved membrane protein. These similarities, 
especially the signal peptide-containing flagellins and 
peptidase, led to the proposal many years ago that the 
incorporation of new subunits would be at the
base, rather than at the distal tip as is the case with bac-
terial flagella (45). The lack of a detectable central
channel in archaeal flagella that was reported recently 
strongly supports the hypothesized assembly mecha- 
nism (23). However, evidence that clearly demon- 
strates polarity is still desirable and necessary. How 
a rotary apparatus would be assembled from the base 
and the nature of the anchoring structure for this 
organelle are important protein secretion/assembly 
questions that need answering. 

The connection between the motility apparatus 
and the environmental sensing system needs to be 
identified. Identification of the switch protein that 
binds CheY-P would be a big step forward since the 
gene for such a protein may be located in an operon 
of other flagella structural protein genes. In the same 
vein, the site of interaction of fumarate at the flagella 
motor should be identified. Continued deletion analy- 
sis of the 18 homologs of the MCPs in H. salinarum 
(54, 103) will undoubtedly lead to the identification
of the role each of them has in sensing the environ- 
ment in this extreme halophile. This will allow for a 
complete picture and integration of various types of 
taxis in one archaeon. 

Finally, the structure of the flagellum itself has
only recently been tackled, by Trachtenberg’s group
(23, 115). The properties of the filament proteins and
how they interact, especially in the most extreme
habitats of hyperthermophily and saturated salt, to
create such stable structures are so far largely unex-
plored and ripe for analysis.

It is expected that the continued study of ar- 
chaeal flagellation and chemotaxis will lead to novel 
discoveries about these structures and processes in ar- 
chaea, and these may in turn lead to insights into the 
understanding of bacterial chemotaxis and type IV 
pili assembly, structure, and function (25). 
Chapter 19


Structure and Evolution of Genomes 


PATRICK FORTERRE, YVAN ZIVANOVIC, AND SIMONETTA GRIBALDO 


INTRODUCTION 


At the archaeal meeting held in Munich in 1993 
for the official retirement of Professor Wolfram Zillig 
(who continued working and publishing until 2004) a 
small informal gathering was organized by Roger 
Garrett to discuss the possibility of sequencing an ar- 
chaeal genome (two years before TIGR sequenced the 
first bacterial genome). The main topic discussed was: 
which archaeal genome should we sequence? Everyone 
in the room started to defend his (her) favorite or- 
ganism (Sulfolobus or Halobacterium, why not a me- 
thanogen?). After some discussion, Carl Woese asked 
to speak, and everybody wondered what would be his 
preferred archaeon. “We need to sequence at least five 
archaeal genomes!” were his words. This seemed non-
sense to most participants, for whom it would already have
been good to have at least one, besides the hu-
man genome! Ten years later, twenty archaeal genomes
were indeed available in public databases. So Carl was 
right after all, as he was in realizing that the
Archaea are not simply “strange bacteria,” but a do- 
main of life on their own (see Chapter 1). 

The sequencing of archaeal genomes has been 
critical for archaeal research, especially because of the 
limited availability of easy-to-handle genetic systems 
(see Chapter 21). Moreover, it has permitted the ex-
istence of Archaea to be advertised outside the im-
mediate fan club. After genome data became avail-
able, people who had never considered working with
such exotic microbes (“God, we have to grow them!”) 
could jump onto the bandwagon by just ordering 
DNA and cloning it into Escherichia coli. The com-
plete sequence of the genome of Methanococcus jan-
naschii was made available by TIGR (17) only one
year after the historical Haemophilus influenzae pub-
lication. Since then, a large number of proteins from
M. jannaschii have been cloned and sequenced by peo-
ple who would never have considered dealing with a
methanogen before. With the yeast genome becoming
accessible the same year (1996), the sequencing of the 
M. jannaschii genome immediately opened up Pan- 
dora’s box for a completely new research field: large- 
scale comparative genomics. 

The first eight sequenced archaeal genomes were 
all from thermophiles and hyperthermophiles. This 
bias toward high temperatures was the product of both 
an academic interest (biologists have been fascinated 
by hyperthermophiles since their discovery in the early 
eighties) and of funding necessity (thanks to PCR and 
Taq polymerase it was easier in the 1990s to ask for 
money for the genome sequencing of a hyperther- 
mophile). More recently, other types of extremophiles 
(halophiles, thermoacidophiles, psychrophiles) have 
entered the pipeline. Eventually, the genomes of several 
mesophilic archaea have appeared, and more exciting 
ones from previously uncultivated archaea are to come 
in the near future. 

The availability of archaeal genomes from organ- 
isms living in such diverse environments has provided 
a unique opportunity to look at how DNA sequences 
are shaped by environmental factors. However, and 
primarily, the sequencing of archaeal genomes has al- 
lowed the concept of the three domains to be tested. In 
fact, although the existence of three distinct domains 
of life was already firmly established through rRNA 
sequence comparison, and supported by the discov- 
ery of unique features in archaeal biochemistry and 
molecular biology (see Chapter 2), it remained impor- 
tant to determine how the uniqueness of Archaea 
translated at the whole-genome level. Not so surpris- 
ingly, some people have found the resemblance of the 
structure and organization of archaeal and bacterial 
genomes to be a temptation to get back to the old 
“prokaryote/eukaryote” dichotomy. However, the
three-domains concept is not based on phenotypic 
traits, but relies on the existence of three different se- 
quence spaces for the macromolecules that are central 
to gene expression and reproduction in the living 
world. This trichotomy was nicely reemphasized by 
comparative genomics, especially when several archaeal 
genomes became available. The vast majority of genes 
in archaeal genomes indeed have their most closely
related homologs in other archaeal genomes, rather 
than bacterial or eucaryal genomes. A general princi- 
ple that has emerged is that all archaeal housekeeping 
proteins (especially those for information processing) 
cluster together in phylogenetic trees, distinct from 
bacterial and eucaryal homologs (when they exist). 
Hence, they correspond to archaeal versions (sensu 
Woese), in the same way that informational proteins 
from bacteria cluster together and eucaryal proteins 
cluster together. 

Archaeal genomics has also confirmed and ex- 
tended the evolutionary linkage between Archaea and 
Eucarya that was previously uncovered by the pio- 
neering work of Wolfram Zillig and coworkers on ar- 
chaeal RNA polymerases. This was especially striking 
for DNA replication, since the majority of bacterial 
DNA replication genes have no homologs in Archaea, 
whereas most essential eucaryal DNA replication pro- 
teins have closely related archaeal homologs. The ex- 
istence of many informational proteins that are com- 
mon to Archaea and Eucarya and absent from Bacteria 
is not only true for DNA replication, but also for 
translation (especially the initiation step), transcrip- 
tion, and RNA and protein processing (i.e., informa- 
tional proteins). This distribution is well illustrated by 
the phylogenetic affinities of the core proteins com-
mon to all archaeal genomes. These include 118 pro-
teins with homologs in both Eucarya and Bacteria,
63 with only eucaryal homologs, and just
13 with only bacterial homologs (74). If the core of
archaeal genomes has a profound eucaryal flavor, its 
periphery (the proteins only present in one or a few ar- 
chaeal genomes) has a rather bacterial affinity. In fact, 
archaeal genomes as a whole always encode more bac- 
terial-like than eucaryal-like proteins. However, most 
of these are operational proteins (such as proteins in- 
volved in metabolism or transport) that show a long 
history of horizontal gene transfer (HGT) between ar- 
chaea and bacteria. 

Many reviews have summarized insight from 
newly sequenced archaeal genomes and have usually 
focused on aspects of comparative genomics (35, 73, 
74, 91). This chapter focuses on the description of 
archaeal genomes and what can be learned from ar- 
chaeal genomics about the mechanisms of genome 
evolution, and the history of the Archaea domain itself. 


GENERAL FEATURES 
OF ARCHAEAL GENOMES 


Compared with Bacteria (231 completely se- 
quenced genomes and 413 in progress at the end of 
August 2005), the number of completely sequenced 
archaeal genomes is much smaller. Currently, only 
24 archaeal genome sequences are available (of which 
only five are for Crenarchaeota) (Table 1) and 25 are 
in progress (of which eleven are for Crenarchaeota) 
(Table 2). 

All archaeal genomes sequenced to date are cir- 
cular and relatively compact, with sizes ranging from 
490 kb for Nanoarchaeum equitans (the smallest cel-
lular genome ever sequenced) (113) up to 5.75 Mb
for Methanosarcina acetivorans (42). The larger 
genomes are from mesophilic archaea and contain a 
high proportion of genes recruited from bacteria by 
HGT (up to a third in Methanosarcina mazei [31]). 
As a rule, genomes from hyperthermophiles appear to
be smaller than genomes from mesophiles. However, 
this might be biased, since most sequenced genomes 
of mesophilic archaea belong to species that are quite 
distantly related to hyperthermophiles. Indeed, the
genomes from the closely related Methanococcus 
maripaludis (mesophilic) and M. jannaschii (hyper- 
thermophilic) have similar sizes (1.6 Mb and 1.74 Mb, 
respectively) (17, 48). 

The average gene density is about one gene per 
kilobase, with usually short intergenic regions. Simi- 
lar to bacteria, gene density slightly decreases and the 
length of intergenic regions increases with genome 
size (42). Generally speaking, little work has been 
done to efficiently mine intergenic regions of archaeal 
genomes, and more systematic and exhaustive ap- 
proaches should be very fruitful for retrieving new, 
important information (e.g., RNA genes, regulatory 
signals, and so forth). In addition to the main chromo- 
some, several archaeal genome sequences include one 
or several plasmids of varying sizes (from 3444 bp of 
Pyrococcus abyssi pGT5 up to 410,554 bp of Haloar-
cula marismortui).
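
As a rough illustration of the figures quoted above, the following Python sketch applies the approximate density of one gene per kilobase to a few genome sizes taken from Table 1. It is only a back-of-the-envelope estimate: the function names are hypothetical, the density is treated as a constant (the text notes that it decreases slightly with genome size), and the results are not annotated gene counts.

```python
# Genome sizes (Mb) copied from Table 1 for a few representative archaea.
GENOME_SIZES_MB = {
    "Nanoarchaeum equitans": 0.49,
    "Methanococcus maripaludis": 1.66,
    "Methanocaldococcus jannaschii": 1.74,
    "Sulfolobus solfataricus": 2.99,
    "Methanosarcina acetivorans": 5.75,
}

GENES_PER_KB = 1.0  # rule of thumb from the text; real densities vary


def estimate_gene_count(size_mb, genes_per_kb=GENES_PER_KB):
    """Rough gene count: genome size in kb times the assumed gene density."""
    return round(size_mb * 1000 * genes_per_kb)


def size_extremes(sizes):
    """Return the names of the smallest and largest genomes in the mapping."""
    smallest = min(sizes, key=sizes.get)
    largest = max(sizes, key=sizes.get)
    return smallest, largest


if __name__ == "__main__":
    for name, size in sorted(GENOME_SIZES_MB.items(), key=lambda kv: kv[1]):
        print(f"{name:32s} {size:5.2f} Mb  ~{estimate_gene_count(size)} genes")
    print("smallest and largest:", size_extremes(GENOME_SIZES_MB))
```

Under this assumption the listed genomes span roughly a tenfold range in estimated gene number, from about 500 genes to several thousand.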

Table 1. Complete and in-progress archaeal genomes in public databases(a)

Organism name                                          Group          Size (Mb)  #chr(b)  #pl(c)  GC%   Inteins(d)  Released
Aeropyrum pernix K1                                    Crenarchaeota  1.67       1        0       67    1           06/26/1999
Archaeoglobus fulgidus DSM 4304                        Euryarchaeota  2.18       1        0       46    0           12/06/1997
Haloarcula marismortui ATCC 43049                      Euryarchaeota  4.27       1        7       61.1  4           11/03/2004
Halobacterium sp. NRC-1                                Euryarchaeota  2.57       1        2       65.9  1           10/01/2000
Methanocaldococcus jannaschii DSM 2661                 Euryarchaeota  1.74       1        3       31.3  19          08/23/1996
Methanococcus maripaludis S2                           Euryarchaeota  1.66       1        0       33.1  0           10/07/2004
Methanopyrus kandleri AV19                             Euryarchaeota  1.69       1        0       60    5           04/04/2002
Methanosarcina acetivorans C2A                         Euryarchaeota  5.75       1        0       42.7  0           04/05/2002
Methanosarcina barkeri strain fusaro                   Euryarchaeota  4.87       1        1       39.2  0           06/26/2002
Methanosarcina mazei Go1                               Euryarchaeota  4.1        1        0       41.5  0           07/20/2002
Methanothermobacter thermautotrophicus strain Delta H  Euryarchaeota  1.75       1        0       49.5  0           11/26/1997
Nanoarchaeum equitans Kin4-M                           Nanoarchaeota  0.49       1        0       31.6  1           10/21/2003
Natronomonas pharaonis DSM 2160                        Euryarchaeota  2.75       1        2       63.1  0           09/28/2005
Picrophilus torridus DSM 9790                          Euryarchaeota  1.55       1        0       36    1           06/09/2004
Pyrobaculum aerophilum strain IM2                      Crenarchaeota  2.22       1        0       52    0           01/17/2002
Pyrococcus abyssi GE5                                  Euryarchaeota  1.77       1        1       42    14          06/01/2001
Pyrococcus furiosus DSM 3638                           Euryarchaeota  1.91       1        0       42    10          08/03/1999
Pyrococcus horikoshii OT3                              Euryarchaeota  1.74       1        0       42    14          07/29/1998
Sulfolobus acidocaldarius DSM 639                      Crenarchaeota  2.23       1        0       36.7  0           07/05/2005
Sulfolobus solfataricus P2                             Crenarchaeota  2.99       1        0       35.8  0           06/28/2001
Sulfolobus tokodaii strain 7                           Crenarchaeota  2.69       1        0       32.8  0           09/27/2001
Thermococcus kodakaraensis KOD1                        Euryarchaeota  2.09       1        0       52.0  8           01/13/2005
Thermoplasma acidophilum DSM 1728                      Euryarchaeota  1.56       1        0       50    1           10/12/2000
Thermoplasma volcanium GSS1                            Euryarchaeota  1.58       1        0       50    1           12/20/2000

(a) Data adapted from http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi as of 11 October 2005.
(b) Number of chromosomes.
(c) Number of plasmids.
(d) From http://bioinfo.weizmann.ac.il/~pietro/inteins/Inteins_table.html (June 2004) and SWISSPROT release 48.2 of 11 October 2005 (http://www.expasy.org/cgi-bin/lists?intein.txt).

Many archaeal genes are grouped in clusters of
various sizes, some of them representing bona fide
operons (i.e., two or more genes translated from a sin-
gle transcript and under the regulation of a repressor
and an inducer). The clustering of archaeal genes is
particularly useful to make functional predictions (55,
73, 81), since genes encoding proteins that interact
with each other (especially subunits of multiprotein
complexes) or participate in a similar molecular mech-
anism (functional operons) are often clustered in ar-
chaeal genomes. For example, the conserved genomic
localization allowed the detection and biochemical char-
acterization of specific archaeal helicases and nucleases
that are coexpressed with the universal DNA recombi- 
nation proteins, Rad50 and Mre11 (27). A few gene 
clusters are conserved in most archaeal genomes, such 
as the superoperon of ribosomal proteins, the RNA 
polymerase operon, or the Mre11 cluster. However, 
most clusters tend to be conserved only in a subset of 
genomes, indicating that their association is constantly 
challenged by genome disruption (see “Mechanisms of 
genome evolution,” below), and subsequently selected 
for in different gene arrangements for functional rea- 
sons. As a rule, gene clusters, including operons, tend 
to be less abundant in small genomes; they are very rare 
in the genome of N. equitans (74, 113). This suggests 
that the genomes have been streamlined by rearrange- 
ments leading to gene losses and cluster disruption. 
Similar to bacteria, some archaeal rRNA and 
tRNA genes contain introns, and some protein-coding 
genes are interrupted by inteins. These mobile ele- 
ments, especially inteins, appear to be more abundant 
in archaea than in bacteria (see Chapters 5 and 22). 
Inteins are especially abundant in Thermococcales 
and in M. jannaschii, but they are rare or even absent 
in other archaea (Table 1) (95). There are no introns in 
archaeal protein-coding genes, consistent with the absence 
of genes encoding homologs of eucaryal spliceosome 
components. In addition to rRNA and tRNA genes that are 
present in all cellular organisms, archaeal genomes 
contain a plethora of eucaryal-like noncoding RNAs 
(snoRNAs) that are involved in rRNA processing, 
and probably many other types of RNA genes that 
are yet to be analyzed (92) (see Chapter 7). 

The analysis of complete archaeal genome se- 
quences was especially important for identifying some 
major features of the mechanism of translation initi- 
ation in archaea (62) (see Chapter 8). Preliminary 
analyses of a few archaeal genes led to the conclusion 
that archaea, like bacteria, mainly use Shine-Dalgarno 
(SD) sequences for the recognition of mRNAs by the 
ribosome. However, this has not been confirmed by 
analyses at the whole-genome level (62). For example, 
mRNAs with Shine-Dalgarno sequences are less 
abundant than “leaderless” mRNAs in Sulfolobus 
acidocaldarius (23) and are apparently completely absent 
in Pyrobaculum aerophilum (33). Archaeal genomes 
encode homologs of all the eucaryal genes that are 
involved in translation initiation (a few of them being 
in all three domains of life), except for the factor that 
recognizes the cap structure present at the 5’ end of 
most eucaryal mRNA (a structure that is absent in 
archaea). Analysis of archaeal genomes also confirmed 
Table 2. Archaeal genomes in progress in public databases^a 


Organism name Group Size^b GC% 
Acidianus brierleyi Crenarchaeota 1.8 - 
Caldivirga maquilingensis Crenarchaeota - 43 
Cenarchaeum symbiosum Crenarchaeota - - 
Ferroplasma acidarmanus Fer1 Euryarchaeota 1.87 36.8 
Halobacterium salinarum Euryarchaeota 4 - 
Halobaculum gomorrense Euryarchaeota - 70 
Haloferax volcanii DS2 Euryarchaeota 4.03 - 
Halorubrum lacusprofundi Euryarchaeota - - 
Hyperthermus butylicus Crenarchaeota 2 56 
Methanococcoides burtonii DSM 6242 Euryarchaeota 2.56 40.8 
Methanococcus voltae Euryarchaeota - 31 
Methanocorpusculum labreanum Euryarchaeota - 50 
Methanoculleus marisnigri Euryarchaeota - 61 
Methanogenium frigidum Ace-2 Euryarchaeota 2.5 - 
Methanosarcina thermophila Euryarchaeota - 42 
Methanospirillum hungatei Euryarchaeota - - 
Methanothermococcus thermolithotrophicus Euryarchaeota - 34 
Natrialba asiatica Euryarchaeota - 62.3 
Pyrobaculum arsenaticum DSM 13514 Crenarchaeota - 58.3 
Pyrobaculum calidifontis Crenarchaeota - 51 
Pyrobaculum islandicum Crenarchaeota - - 
Pyrolobus fumarii Crenarchaeota - 53 
Staphylothermus marinus Crenarchaeota - 35 
Thermofilum pendens Crenarchaeota - 57.4 
Thermoproteus neutrophilus Crenarchaeota - 56.2 


^a Data adapted from http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi as of 11 October 2005. 
^b Size estimates in Mb. 


the eucaryal nature of the archaeal transcription ma- 
chinery and revealed a complex network of regula- 
tory factors with both bacterial and eucaryal features 
(3). Whereas promoters are now well defined in ar- 
chaea, the identification of transcription terminators 
is still elusive. Analyses of a few transcripts in the 
1980s identified the presence of sequences resembling 
“bacterium-like” terminators at the 3’ end of a few 
archaeal genes. However, these observations were not 
confirmed by analyses at the whole-genome level. 


CHROMOSOME ORGANIZATION 
Replication Origins 


The impact of genomics on archaeal research 
was especially critical for the field of DNA replica- 
tion. Prior to genome sequencing, nothing was known 
about chromosomal replication origins in archaea 
and only a handful of proteins that were possibly in- 
volved in DNA replication had been identified. The 
availability of complete genome sequences allowed 
the localization of DNA replication origins in several 
archaeal genomes to be predicted by in silico analy- 
ses (see Chapter 3). In bacterial genomes, the leading 
strand is enriched in guanine (G) residues in compar- 


ison with the lagging strand (Fig. 1a) (61). It has been 
proposed that GC skews may compensate for cyto- 
sine deamination on single-stranded DNA at replica- 
tion forks (38, 103). Short, specific sequences of a few 
nucleotides (words) are also unevenly distributed be- 
tween the two strands (9). In E. coli, octamers that 
exhibit a strong bias in their distribution between the 
two strands are recognition signals for the primase 
DnaG and for the RecBCD recombination system 
(Chi sequences) (9). Since the leading strand becomes 
the lagging strand both at the origin and terminus of 
DNA replication (oriC and terC, respectively) (Fig. 
1b), several studies showed that it was possible to 
localize the origin and terminus of DNA replication 
in bacteria by scanning (from an arbitrary position) 
any complete genome sequence for either GC or spe- 
cific word skews (Fig. 1c) (60). 
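
The scanning procedure described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method of the cited studies: the function names, the window size, and the per-window skew definition (G - C)/(G + C) are choices made here for the example. With this sign convention, the running sum falls while the scanned strand is the lagging strand and rises while it is the leading strand, so its minimum marks the putative origin and its maximum the putative terminus.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return window end positions and the running sum of per-window GC skew."""
    ends, cumulative = [], []
    total = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        # Skew of one window; 0.0 if the window contains no G or C at all.
        total += (g - c) / (g + c) if (g + c) else 0.0
        ends.append(start + window)
        cumulative.append(total)
    return ends, cumulative


def predict_origin_and_terminus(seq, window=1000):
    """Putative oriC at the cumulative minimum, putative terC at the maximum."""
    ends, cumulative = cumulative_gc_skew(seq, window)
    ori = ends[cumulative.index(min(cumulative))]
    ter = ends[cumulative.index(max(cumulative))]
    return ori, ter


# Synthetic sequence (not real data): C-rich, then G-rich, then C-rich again.
toy = "ACCT" * 750 + "AGGT" * 1250 + "ACCT" * 500
print(predict_origin_and_terminus(toy, window=1000))  # (3000, 8000)
```

On a real genome the window size would need tuning, and a word skew could be substituted for the G/C counts without changing the cumulative logic.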

The first direct application of GC skew analysis 
to the genome of an archaeon, M. jannaschii, sug- 
gested many origins (86). However, by using a more 
refined method (cumulative GC or word skews), a 
single origin was predicted in the genomes of Metha- 
nothermobacter thermautotrophicus and Pyrococcus 
horikoshii (63). In both cases, graphs of cumulative 
GC or word skews exhibited two inverted peaks on 
each side of the base line (with the putative oriC 
located at the bottom of the negative peak in GC 
[Figure 1c plot: cumulative skew (vertical axis) against genomic position (horizontal axis), scanned from an arbitrary start.] 

Figure 1. (a) A schematic drawing of a DNA replication fork in Bacteria and Archaea. The leading strand (thick line) is enriched 
in guanine residues or specific nucleotide words with respect to the lagging strand (thin line). (b) At the origin and terminus 
of DNA replication (oriC and terC, respectively) the leading strand becomes the lagging strand, and vice versa. (c) It is possible 
to localize the origin and terminus of DNA replication by scanning from an arbitrary position any complete genome sequence 
for either GC or specific word skews. Cumulative skews exhibit two inverted peaks on each side of the baseline, which 
correspond to the origin and terminus of replication. 


skew analysis) (Fig. 2), indicating that these archaea 
replicate their genome from a single origin with two 
replication forks moving in opposite directions, as in 
bacteria. Bidirectional replication from a single origin 
was experimentally validated in another Pyrococcus 
species, P. abyssi (80, 87). Collectively, these results 
indicate that, although archaea possess a eucarya-like 
replication machinery, the selection pressure respon- 
sible for GC skews has affected them in a manner 
similar to bacteria. 

Computational studies using cumulative GC 
skews, or a more exhaustive method (Z curves), allowed 
additional oriC sites to be identified in archaea. 
Z curves (their name is derived from their zigzag 
shape) are three-dimensional curves that take into ac- 
count all possible skews in base composition (purine 
versus pyrimidine [RY], amino versus keto [MK], or 
weak versus strong hydrogen bond bases [WS]). For 


a particular genome, the Z curve produces several dif- 
ferent graphs (disparity curves) with peaks that cor- 
respond to putative origins or termini of replication 
(cumulative GC skews can be seen as a special case of 
Z curves). As a consequence, several origins that are 
not detected by GC skew analyses can be discovered 
by other skews revealed by Z curves. This method was 
especially successful for identifying a previously un- 
detected, putative oriC in the genome of M. jannaschii 
(119). Software to draw and manipulate Z curves is 
available free online at http://tubic.tju.edu.cn/zcurve/. 
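
As a rough illustration of how the three disparity curves relate to each other, the sketch below computes Z-curve coordinates for a sequence. It assumes the commonly used component definitions (x: purines minus pyrimidines; y: amino minus keto bases; z: weak minus strong hydrogen-bond bases); the names `STEP`, `z_curve`, and `gc_disparity` are hypothetical and unrelated to the software at the URL above. The last function shows in what sense the cumulative G minus C count is a special case: it can be recovered as (x - y)/2.

```python
# Contribution of each base to the three components:
# x: purine (A, G) vs pyrimidine (C, T)
# y: amino (A, C) vs keto (G, T)
# z: weak (A, T) vs strong (G, C) hydrogen bonding
STEP = {"A": (1, 1, 1), "G": (1, -1, -1), "C": (-1, 1, -1), "T": (-1, -1, 1)}


def z_curve(seq):
    """Return the list of cumulative (x, y, z) points along the sequence."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEP.get(base, (0, 0, 0))  # ambiguous bases add nothing
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def gc_disparity(points):
    """Cumulative G minus C count, recovered from x and y as (x - y) / 2."""
    return [(x - y) // 2 for x, y, _ in points]


print(z_curve("AGCT"))               # [(1, 1, 1), (2, 0, 0), (1, 1, -1), (0, 0, 0)]
print(gc_disparity(z_curve("GGC")))  # [1, 2, 1]
```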

Putative oriC sites have been detected in 15 of 
19 completely sequenced archaeal genomes tested 
(117) (Table 3). Computational methods have failed 
to predict any oriC sites in the genomes of Methano- 
pyrus kandleri, N. equitans, Archaeoglobus fulgidus, 
and Sulfolobus tokodaii (117). This is probably due 
to recent rearrangements that have perturbed the 
Figure 2. Graphs of cumulative skew and GC skew for Methanobacterium thermautotrophicum and Pyrococcus horikoshii. 
Putative oriC in both species are indicated by arrows. 


skew in these particular genomes. A putative oriC 
was predicted in A. fulgidus by marker frequency 
analysis (70). This method is based on the fact that 
the copy number of a particular gene in an exponen- 
tially growing cell population is directly related to its 
position relative to the replication origin (Fig. 3). Sur- 
prisingly, two origins of replication were predicted in 
the genome of Halobacterium NRC1 (54), and three 
origins were recently detected in the genome of Sul- 
folobus solfataricus and S. acidocaldarius by marker 
frequency analysis on DNA chips (67). The three ori- 
gins of S. solfataricus have been validated by two-di- 
mensional gel electrophoresis (99). 
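
The principle of marker frequency analysis can be sketched as a search for local maxima in a copy-number profile measured around a circular chromosome. The function below is a deliberately simple, hypothetical illustration (the name `putative_origins` and the example values are invented here, not taken from the cited studies); real microarray data are noisy and would need smoothing before peaks are called.

```python
def putative_origins(copy_numbers):
    """Indices of local maxima in a circular copy-number profile.

    copy_numbers[i] is the relative abundance of marker i; markers are
    ordered by chromosomal position and the last one neighbors the first.
    """
    n = len(copy_numbers)
    peaks = []
    for i, value in enumerate(copy_numbers):
        left = copy_numbers[(i - 1) % n]
        right = copy_numbers[(i + 1) % n]
        if value > left and value > right:
            peaks.append(i)
    return peaks


# Invented profiles: one peak (single origin) and three peaks (three origins).
print(putative_origins([1.0, 1.2, 1.5, 2.0, 1.6, 1.3, 1.1, 0.9]))            # [3]
print(putative_origins([2.0, 1.5, 1.2, 1.6, 1.9, 1.4, 1.1, 1.5, 1.8, 1.3]))  # [0, 4, 8]
```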

The identification and experimental validation of 
archaeal oriC sites allowed analysis of their sequences 
and genomic contexts. Archaeal oriC sites are char- 
acterized by the presence of 13mer repeats, AT-rich 


regions, and long inverted repeats (63, 87, 99, 119). 
Most archaeal oriC sites are located in intergenic re- 
gions just upstream of the genes encoding the ho- 
mologs of eucaryal initiation factors Cdc6 and Orc1 
(see Chapter 3). These proteins (hereafter referred to 
as Orc1/Cdc6) bind to archaeal oriC, both in vivo and 
in vitro (80, 99). Several archaeal genomes (and halo- 
archaeal plasmids) encode more than one Orc1/Cdc6 
protein (up to 9 in Halobacterium NRC1), and the 
presence of an orc1/cdc6 gene is not necessarily correlated 
with the presence of an origin of replication 
(and vice versa). For example, although Sulfolobus 
genomes contain three orc1/cdc6 genes, only two of 
the three oriC are located close to an orc1/cdc6 gene 
(66). Surprisingly, in Halobacterium NRC1, two orc1/cdc6 
genes are located at the peaks that normally correspond 
to terC positions in cumulative GC skew 
Table 3. Complete archaeal genome replication origin identification 


Organism name Group Status of replication origin identification 
Aeropyrum pernix K1 Crenarchaeota Unknown 
Archaeoglobus fulgidus DSM 4304 Euryarchaeota Approximate location known based on marker frequency (70) 
Halobacterium sp. NRC-1 Euryarchaeota Two replication origins predicted (54), one identified in vivo (8) 
Methanocaldococcus jannaschii DSM 2661 Euryarchaeota One replication origin identified with Z-curve method (119) 
Methanococcus maripaludis S2 Euryarchaeota Unknown 
Methanopyrus kandleri AV19 Euryarchaeota Uncertain 
Methanosarcina acetivorans C2A Euryarchaeota One replication origin identified with GC skew analysis (D. Cortez, personal communication) 
Methanosarcina mazei Go1 Euryarchaeota One replication origin identified with Z-curve method (119) 
Methanothermobacter thermautotrophicus strain Delta H Euryarchaeota One replication origin identified with word skew analysis (63) 
Nanoarchaeum equitans Kin4-M Nanoarchaeota Unknown 
Picrophilus torridus DSM 9790 Euryarchaeota Unknown 
Pyrobaculum aerophilum strain IM2 Crenarchaeota Unknown 
Pyrococcus abyssi GE5 Euryarchaeota One replication origin identified with word skew analysis (63), and confirmed in vivo by 2D gel analysis (87) 
Pyrococcus furiosus DSM 3638 Euryarchaeota One replication origin identified with word skew analysis (63) and comparison with close homolog P. abyssi 
Pyrococcus horikoshii OT3 Euryarchaeota One replication origin identified with word skew analysis (63) and comparison with close homolog P. abyssi 
Sulfolobus solfataricus P2 Crenarchaeota Two replication origins identified in vivo (99), third replication origin identified by marker frequency analysis (66) 
Sulfolobus tokodaii strain 7 Crenarchaeota Unknown 
Thermoplasma acidophilum DSM 1728 Euryarchaeota One replication origin identified with GC skew analysis (D. Cortez, personal communication) 
Thermoplasma volcanium GSS1 Euryarchaeota Unknown 


analyses (54). One of these peaks is a bona fide origin 
that has now been experimentally validated in vivo 
(8). The GC skew is therefore inverted in Halobac- 
terium salinarium compared with other organisms 
(i.e., the leading strand being enriched in C instead 
of G). The same inverted GC skew was observed in the bacterium 
Streptomyces coelicolor (6). It is unclear if 
these inversions are related to the very high GC con- 
tent of both genomes (71% in S. coelicolor and 66% 
in Halobacterium NRC1) or to both genomes having 
experienced frequent rearrangements triggered by dele- 
tions and insertion of large plasmids. In a few archaeal 
genomes, the genes around oriC include genes encod- 
ing putative DNA replication proteins other than orc1/ 
cdc6. The most striking example is a replication island 
in Pyrococcus that contains, in addition to orc1/cdc6 
genes, genes encoding the two subunits of the DNA 
polymerase D family, and the two subunits of the 
clamp-loading factor, RFC (87). Other putative ori- 


gins have been localized close to genes encoding DNA 
primase or DNA polymerase subunits (63, 118). 


Replication Termini and Replichore Organization 


In archaea with a single replication origin, the 
chromosome is divided into two replichores (left and 
right) of equal size by oriC and terC, indicating that 
the two replication forks move at the same rate. The 
four replichores of the H. salinarium NRC1 genome 
also have the same size (54), but this is not the case 
for the six replichores of Sulfolobus species (66). In a 
population of synchronized cells, replication initiates 
simultaneously at the three oriC of Sulfolobus, and the forks move at the 
same rate. Therefore, termination is not simultaneous, 
and short replichores are completely replicated be- 
fore larger ones. This feature partially resembles the 
eucaryal mode of replication, where multiple replicons 
are replicated at different times of the cell cycle. 
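
The reasoning above (simultaneous initiation plus equal fork speed implies that short replichores finish first) can be made concrete with a small calculation. The sketch below is hypothetical: the function names, the origin coordinates, and the fork rate are invented for illustration and are not measured Sulfolobus values. It assumes that the two forks converging between adjacent origins meet halfway.

```python
def replichore_lengths(genome_length, origins):
    """Replichore length for each gap between adjacent origins on a circle.

    Two forks converge in every inter-origin gap, so each gap contributes
    two replichores of gap / 2 when all origins fire together and all
    forks move at the same rate.
    """
    ordered = sorted(origins)
    lengths = []
    for i, ori in enumerate(ordered):
        nxt = ordered[(i + 1) % len(ordered)]
        # With a single origin the "gap" is the whole chromosome.
        gap = (nxt - ori) % genome_length or genome_length
        lengths.append(gap / 2)
    return lengths


def termination_times(lengths, fork_rate):
    """Time needed to finish each replichore at a constant fork rate."""
    return [length / fork_rate for length in lengths]


# Invented example: a 3-Mb circle with three unevenly spaced origins.
lengths = replichore_lengths(3_000_000, [0, 600_000, 1_800_000])
print(lengths)                           # [300000.0, 600000.0, 600000.0]
print(termination_times(lengths, 1000))  # [300.0, 600.0, 600.0]
```

With one origin the function returns a single value of half the chromosome, matching the two equal replichores described for single-origin archaea.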
Figure 3. Marker frequency analysis. In a population of nonsynchronized cells, the copy number of a particular gene is directly 
proportional to its closeness to the origin of replication (oriC). Boxes represent genes. Gene 1 is the closest to oriC and the most 
represented. 


The regions corresponding to archaeal terC have 
not yet been characterized at the sequence level. They 
seem to be less well defined than oriC, since the peaks 
of cumulative GC skew or Z-curve graphs corresponding 
to terC are usually broader than those corresponding 
to oriC (Fig. 2). Proteins known to be involved in 
the termination of DNA replication in E. coli and 
Bacillus subtilis (Tus and RTP, respectively) are not homologous 
to each other and have homologs only in 
closely related species. The mechanism of DNA repli- 
cation termination is completely unknown in eucarya, 
and it remains a mystery in archaea (see Chapter 2). 

All archaeal genomes encode a single protein 
that is homologous to bacterial paralogs XerC and 
XerD. In bacteria, these proteins are involved in the 
resolution of dimeric chromosomes that are produced 
by recombination during bacterial DNA replication 
(60). XerCD proteins recognize specific sites, referred 
to as dif (deletion induces filamentation) sites, which 
are located near terC. Sequences with similarity to dif 
sites, or other putative Xer-binding sites, have not 
been identified in archaeal genomes. It would be especially 
interesting to purify archaeal Xer proteins and to 
search in vitro for their specific binding sequences. 

The chromosome terminus appears to be a hot 
spot of recombination in archaea, as it is in bacteria. 
This was clearly shown from a genome comparison of 
the two closely related species, P. abyssi and P. hori- 
koshii. The terminus region has in fact experienced 
several rearrangements after the divergence of these 
two species, including the translocation of two regions 
of synteny (conservation of gene order) (Fig. 4A, frag- 
ments E and F) (120). 


Organization of Transcription Units 
at the Whole-Genome Level 


Similar to bacteria, highly expressed genes (in 
particular, those encoding rRNA and ribosomal pro- 
teins) in archaea are frequently located close to oriC to 
increase their copy number in actively replicating cells 
(see references 100 and 101 and references therein). 
There is also a general tendency to have more genes 
orientated in the direction of DNA replication than 
in the opposite direction (see references 100, 101 and 
references therein). This phenomenon induces a skew 
of transcription (more genes are transcribed on the 
leading strand). Cumulative transcription skew analy- 
sis in bacteria and archaea often produces graphs that 
are quite similar to those obtained with GC skews, 
since this skew also changes orientation at oriC and 
terC (63, 120). The strength of the colinearity between 
transcription and replication is very variable in differ- 
ent bacteria (82), and this is probably also the case 
for archaea. This trend is even reversed in P. furiosus, 
where slightly more genes are transcribed in the op- 
posite direction to DNA replication (120). This could 
be due to recent, extensive rearrangements in this ar- 
chaeon (see “Mechanisms of genome evolution,” be- 
low). However, a clear colinearity between replication 
and transcription was observed in the three Pyrococ- 
cus species when the analysis was restricted to highly 
expressed genes (120). 
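
A cumulative transcription skew can be computed in the same way as a cumulative GC skew, by walking along the chromosome and adding +1 or -1 for each gene according to its strand. The following sketch assumes a list of strand symbols ordered by gene position; the function name and the toy input are hypothetical.

```python
def cumulative_transcription_skew(strands):
    """Running count of '+' strand genes minus '-' strand genes."""
    total, curve = 0, []
    for strand in strands:
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand symbol: {strand!r}")
        total += 1 if strand == "+" else -1
        curve.append(total)
    return curve


# Invented gene order in which every gene is colinear with replication:
# the curve turns at the putative origin (minimum) and terminus (maximum).
print(cumulative_transcription_skew(list("---+++++--")))
# [-1, -2, -3, -2, -1, 0, 1, 2, 1, 0]
```

In real genomes the colinearity is only a tendency, so the curve is noisier than in this toy case, and local disturbances can point to rearranged fragments.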

The correlation between colinearity in transcrip- 
tion and replication and a high level of gene expres- 
sion has also been observed in many bacterial genomes 
(102). Since a replication fork has more chance to en- 
Figure 4. (A) A schematic view of the genome organization of the two closely related species Pyrococcus abyssi and Pyrococcus 
horikoshii. Fragments E and F in the terminus region show a translocation between the two species, while fragment A, containing 
oriC, and fragment C show an inversion. (B) Comparative BLAST-hit plot of the two genomes. (Adapted from reference 
120 with permission.) 


counter an RNA polymerase transcribing a highly ex- 
pressed gene in its trajectory along the chromosome, it 
was thought that the main purpose of this organiza- 
tion was to avoid head-to-head collisions between 
replication forks and transcription forks to prevent the 
loss of transcripts, the reduction of replication rate, 
and/or the collapse of replication forks (10). However, 
from the systematic gene disruption analysis of protein 
function in B. subtilis, it was recently shown that the 
strongest trend was for transcription of essential genes 
in the direction of DNA replication (102). Therefore, 
the purpose of colinearity in bacteria (as represented 
by B. subtilis) is probably to prevent the abortive tran- 
scription of essential genes to avoid the accumulation 
of truncated mRNA or proteins that could be delete- 
rious to the cell. As essential genes are often highly 
transcribed (but not always), the correlation between 
the level of transcription and transcription/replication 


colinearity could be indirect. It will be very interest- 
ing to determine if the same rule exists in archaea. This 
will require the systematic disruption of all genes in 
at least one archaeon. 


Repeated Sequences 


Archaeal genomes contain various types of re- 
peated sequences. Most archaeal genomes contain in- 
sertion sequences (IS) that may represent active mobile 
elements or relics of past invasion by transposable el- 
ements (see Chapter 5). Active IS are especially abun- 
dant in S. solfataricus and Halobacterium NRC1 (201 
and 48 copies, respectively) (15), where they are re- 
sponsible for the well-known genetic instability of 
these strains. In the case of S. solfataricus, they con- 
stitute more than 10% of the genome sequence (23). 
These IS elements belong to various families (some of 
them including bacterial homologs), indicating that 
they may play a major role in promoting HGT be- 
tween the two domains via illegitimate recombination 
between the archaeal chromosome and incoming bac- 
terial DNA fragments (or vice versa). 

A second class of mobile elements, present in only 
a few archaeal genomes (mainly Sulfolobus), 
are MITEs (Miniature Inverted Repeat Transposable 
Elements) (15) (see Chapter 5). MITEs lack open 
reading frames (ORFs), indicating that they depend 
on host-encoded transposases for their mobility. 
These small elements (~50 to 400 bp) are common in 
eucaryal and bacterial genomes and could be derived 
from ISs as a result of deletion within an IS element 
or acquisition of terminal inverted repeats. The high 
abundance of MITEs in the S. solfataricus and S. tokodaii 
genomes (143 and 61, respectively) correlates 
with the large number of IS elements in these species 
(201 and 34, respectively) (15). 

A very different group of nonautonomous re- 
peated sequences found in most archaeal genomes 
(both Euryarchaeota and Crenarchaeota) are clusters 
of short repeated sequences (20 to 40 bp) separated by 
spacers of similar size and variable noncoding sequences 
(Fig. 5). These clusters, which are also present 
in several bacterial lineages, have been described as 
SRSRs (Short Regularly Spaced Repeats) (84), LCTRs 
(Large Cluster of Tandem Repeats) (120), or CRISPRs 
(Clustered Regularly Interspaced Short Palindromic 
Repeats) (52). The term CRISPR has been adopted for 
this chapter. These repeats are conserved between 
closely related species but are quite variable between 
different archaeal and bacterial lineages. The number 
of CRISPRs and their location in the genome can vary. 
For instance, 4, 6, and 7 CRISPRs are present at dif- 
ferent locations in the genomes of P. abyssi, P. hori- 
koshii, and P. furiosus, respectively (Fig. 4) (120). 
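
The repeat-spacer architecture described above lends itself to a naive computational search: look for an exact repeat of fixed length that recurs several times with spacers of bounded length in between. The sketch below is a hypothetical illustration only (the function name, parameters, and toy sequence are invented here); it handles exact repeats of one chosen length and is not the approach of any published CRISPR finder.

```python
from collections import defaultdict


def find_crispr_like_arrays(seq, repeat_len, min_spacer=20, max_spacer=40,
                            min_repeats=3):
    """Map each exact repeat to the start positions of its longest regular run."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - repeat_len + 1):
        positions[seq[i:i + repeat_len]].append(i)

    arrays = {}
    for kmer, starts in positions.items():
        run = [starts[0]]
        best = run
        for prev, cur in zip(starts, starts[1:]):
            spacer = cur - prev - repeat_len  # bases between two copies
            if min_spacer <= spacer <= max_spacer:
                run.append(cur)
            else:
                run = [cur]  # spacing broken: start a new run
            if len(run) > len(best):
                best = run
        if len(best) >= min_repeats:
            arrays[kmer] = best
    return arrays


# Invented toy array: four copies of a 24-bp repeat, three distinct 30-bp spacers.
repeat = "GTTGAAATCAGACCAAAATGGGAT"
spacers = ["ACGT" * 7 + "AC", "AACC" * 7 + "AA", "TTGG" * 7 + "TT"]
toy = repeat + "".join(s + repeat for s in spacers)
print(find_crispr_like_arrays(toy, repeat_len=24))
```

The default spacer bounds follow the 20 to 40 bp range quoted in the text; degenerate repeats and unknown repeat lengths would need a more tolerant search.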

CRISPRs are transcribed, suggesting a regulatory 
role via antisense RNA, and a CRISPR-binding pro- 
tein has been purified from S. solfataricus (94). Clues 
to the origin of CRISPRs came from the finding that 
some spacer sequences are homologous to sequences 
Figure 5. Schematic structure of a typical CRISPR. 


from plasmids, viruses, or integrated prophages (97). 
Furthermore, it has been observed that archaeal viruses 
or conjugative plasmids harboring CRISPRs apparently 
cannot infect cells containing CRISPRs with homologous 
spacer sequences (83). This suggests that 
CRISPRs in cellular genomes are of viral (plasmid) 
origin and are now part of an immunity system that 
prevents invasion by further foreign elements. This hy- 
pothesis is supported by experimental evidence in 
Haloarchaea that exhibit incompatibility between 
replicons that contain similar CRISPRs (83). Genes 
that are present in the vicinity of CRISPRs (referred 
to as cas genes, for CRISPR-associated genes) (52) belong 
to a cluster of genes predicted to encode a new 
and “mysterious” DNA repair system (72, 97). These 
proteins, which include a putative helicase and a RecB 
homolog, are probably involved in the mobility of 
their associated CRISPRs. It has also been suggested 
that CRISPRs are important for chromosome parti- 
tioning (85). This hypothesis is supported by the pres- 
ence of CRISPRs located symmetrically with respect 
to oriC in P. abyssi and P. horikoshii, or to terC in the 
three Pyrococcus species (Fig. 4) and S. acidocaldarius 
(23, 120). The evolutionary origin of CRISPRs is mysterious, 
since they are present in completely unrelated 
viruses, such as bacteriophages and viruses of hyper- 
thermophilic archaea (97). 


MECHANISMS OF GENOME EVOLUTION 
Replication-Driven Genome Rearrangements 


The sequencing of several genomes from closely 
related species is a very powerful approach to analyze 
the mechanism of genome evolution. For example, it 
was very informative to compare the three Pyrococ- 
cus genomes that were sequenced at the end of the 
1990s. The genomes of P. abyssi and P. horikoshii are 
sufficiently closely related to allow the identification 
of long regions of synteny, thereby enabling a precise 
analysis of the genomic rearrangements that occurred 
after the separation of these two species. The major 
rearrangement is the inversion of a large fragment 
containing oriC (Fig. 4A, fragment A). As oriC is lo- 
cated roughly at the middle of the inverted fragment, 
this inversion produces an X-shaped figure in a com- 
parative BLAST-hit plot of the two genomes (Fig. 4B). 
X-shaped figures often arise from the comparative 
analyses of closely related bacterial genomes (111). 
This indicates that inversions around oriC are more 
frequent than others that may occur and that they 
frequently occur between closely related bacteria. 
These types of inversions are probably favored since 
they maintain the same direction of transcription and 




replication (i.e., colinear) for all the genes present on 
these fragments. This suggests that the inversions take 
place between the two replication forks, moving at 
the same speed in opposite directions from the origin 
(111). If this model is correct, the observation of 
X-shaped figures in archaea suggests that a single 
replication “factory” exists that couples two replica- 
tion forks (similar to bacteria) (120). It has been pro- 
posed that type II DNA topoisomerases that are asso- 
ciated with replication forks may be involved in this 
process (76). Alternatively, recombination repair of a 
broken replication fork may use the daughter strand 
of the second fork that is present in the replication 
“factory” as template, instead of the daughter strand 
of the same fork. Irrespective of the mechanism, the 
X-shaped figures derived from closely related genomes, 
and the presence of extensive rearrangement at terC 
(see “Chromosome organization,” above), indicate 
that replication is a major force driving genome re- 
arrangement (111) that is common to both archaea 
and bacteria. 
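
Why inversions centered on oriC preserve colinearity, and why they trace an X in a comparative plot, can be checked with a toy model. In the sketch below, gene positions are coordinates relative to oriC (negative on the left replichore, positive on the right), strands are +1 or -1, and a gene counts as colinear when it points away from the origin. All names and values are hypothetical and serve only to illustrate the argument.

```python
def invert_segment(genes, start, end):
    """Invert the segment [start, end]: positions are mirrored, strands flipped."""
    result = []
    for name, pos, strand in genes:
        if start <= pos <= end:
            pos, strand = start + end - pos, -strand
        result.append((name, pos, strand))
    return result


def is_colinear(pos, strand):
    """True if the gene is transcribed in the direction of fork movement."""
    return strand == (1 if pos > 0 else -1)


# Invented ancestor: four genes, all colinear with replication.
ancestor = [("g1", -3, -1), ("g2", -1, -1), ("g3", 2, 1), ("g4", 4, 1)]

symmetric = invert_segment(ancestor, -3, 3)   # inversion centered on oriC
asymmetric = invert_segment(ancestor, 1, 5)   # inversion within one replichore

print([is_colinear(p, s) for _, p, s in symmetric])   # [True, True, True, True]
print([is_colinear(p, s) for _, p, s in asymmetric])  # [True, True, False, False]

# Dot-plot coordinates (ancestor position, new position): genes inside the
# symmetric inversion fall on the anti-diagonal, the others on the diagonal.
print([(a[1], b[1]) for a, b in zip(ancestor, symmetric)])
# [(-3, 3), (-1, 1), (2, -2), (4, 4)]
```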

Remarkably, comparative analysis of the cumulative 
skews of transcription between P. horikoshii and 
P. abyssi made it possible to attribute specific rearrangements 
to a particular species (120). It was inferred that 
the inversion of a small fragment inside the left replichore 
(Fig. 4A, fragment C) and a large translocation 
at the terminus (Fig. 4A, fragments E and F) both occurred 
in the P. abyssi lineage (120). This conclusion is 
based on the observation that the cumulative tran- 
scription skew lines are disturbed at these two loca- 
tions in the P. abyssi graph, whereas they are smooth 
in the graph for P. horikoshii (Fig. 6). This means that, 
in P. abyssi, the genes present on these two fragments 
are transcribed in a direction opposite to that of replication, 
while in P. horikoshii they kept the same direction 
that they had in the genome of the ancestor of 
these two Thermococcales. Note that the nucleotide 
skews for these two fragments in P. abyssi are smooth, 
in contrast to the marked alteration of the transcrip- 
tional skew line (Fig. 6). This indicates that composi- 
tional skews were fully restored in P. abyssi (at the 
time of genome sequencing). This rapid restoration is 
consistent with the general observation that strand 
bias in nucleotide composition is probably neutral and 
evolves fast (103). The rapid restoration of base com- 
position skews implies a huge modification in the 
genome sequence at the nucleotide level (with possi- 
ble accumulation of neutral mutations at the protein 
level). Genome shuffling thus appears to be a critical 
force in modifying DNA sequences, via selection pres- 
sure acting for the restoration of nucleotide skews. 
The main sequence of events that may occur in 
archaeal and bacterial genome evolution would thus 
be as follows: (i) replication-driven rearrangements 
inducing skew inversions, (ii) restoration of normal 
Figure 6. Cumulative transcription skew lines are smooth at fragments E, F, and C in P. horikoshii, while they are disturbed 
in P. abyssi. (Adapted from reference 120 with permission.) 
[Figure 6 panel labels: AT skew; Transcriptional skew.] 
skews inducing sequence evolution, (iii) selection of 
sequence changes compatible with optimal function 
for an organism at a given time. It is important to test 
the validity of these assumptions by comparing other 
closely related archaeal genomes and to check the 
prediction that genome rearrangements may increase 
the rate of protein evolution. 


Recombination Induced by Transposition 


Besides rearrangements driven by DNA replica- 
tion, a major force shaping genome evolution in ar- 
chaeal and bacterial genomes is recombination induced 
by mobile elements (see Chapter 5). The presence of 
repeated sequences can also promote homologous 
recombination inside the same chromosome, or be- 
tween different chromosomes in species containing 
multiple chromosomal copies. Comparing the three 
Pyrococcus species, and extending this to the recently 
published sequence of Thermococcus kodakaraensis, 
has been very informative for unraveling the mecha- 
nisms of genome evolution. The extensive studies of 
Sulfolobales genomes and their extrachromosomal el- 
ements have also been extremely enlightening for this 
purpose (16, 23, 107). Many relics of tRNA genes 
and insertion sequences have been detected at the 
border of recombined fragments between different 
Pyrococcus and Sulfolobus species, indicating that the 
majority of small rearrangements were mediated by 
mobile elements that integrated in tRNA genes (59, 
69, 120). In the case of Pyrococcus, many rearrange- 
ments are also associated with genes encoding en- 
zymes involved in restriction/modification (RM) sys- 
tems, suggesting that RM genes themselves behave as 
mobile elements and cause genome rearrangements 
(24). Genome shuffling by transposon-driven chro- 
mosomal rearrangements can induce a strong pertur- 
bation in the usual patterns of genome organization. 
The genomes of S. solfataricus and S. tokodaii contain 
many ISs and are both extensively shuffled. Similarly, 
chromosomal rearrangements have been more impor- 
tant in the evolution of the genome of P. furiosus than 
in the two other Pyrococcus species (120), and this 
could be related to the higher number of transposases 
in P. furiosus than in P. abyssi and P. horikoshii. Recent 
invasion by transposases could explain the inverted 
transcriptional skew in P. furiosus (120). The 
major rearrangements that occurred between P. fu- 
riosus and the two other pyrococci (four inversions 
and four transpositions), occurred within one repli- 
chore (120). This could reflect the existence of a phys- 
ical or topological barrier that could prevent recom- 
bination by transposition between DNA fragments 
present on different replichores (120). However, this 
could also simply reflect the fact that these transposi- 


tions preferentially occur at relatively short distances 
on the same chromosome, for structural or mechanis- 
tic reasons. 


Gene Loss and Acquisition 


Comparative genomics has shown that gene loss, 
gene duplication, and integration of foreign DNA are 
major forces shaping genome evolution. Although gene 
loss is difficult to quantify, it may be a frequent and 
ongoing process in archaeal genomes. Indeed, it was 
recently shown that as much as 15% of archaeal ribo- 
somal proteins have been lost in different archaeal lin- 
eages (59). Gene loss and nonorthologous replacement 
seem to be frequent in the components of archaeal 
metabolic pathways (71). Several protein family expan- 
sions identified in complete archaeal genomes have 
been linked to different physiological requirements 
(71). For example, nearly half of all annotated genes in 
the metabolically versatile methanogen, M. acetivorans, 
belong to one of 539 multigene families (42). 

Integration of foreign elements (plasmids, viruses) 
containing ISs, or activation of resident transposons, 
can directly induce gene loss, acquisition, or inversion. 
The continuous evolution of archaeal genomes by 
gene loss and acquisition is obvious from the compar- 
ative analyses of the proteomes of closely related 
species (23, 25, 39). There are always a number of 
proteins that are absent in close relatives. This clearly 
indicates either recent acquisition by HGT or recent 
loss in these relatives. The genes that appear to have 
recently integrated into archaeal genomes sometimes 
encode proteins with homologs in more distantly 
related archaeal or bacterial species. In the genome of 
P. furiosus, a 16-kb fragment is flanked by two IS ele- 
ments (32) (see Chapters 16 and 20). This region is 
absent from the two other Pyrococcus species, but a 
very similar fragment is present in Thermococcus lit- 
toralis, indicating a recent transfer between Thermo- 
coccales (32). The identification of bacterial-like genes 
in P. abyssi, which are absent in the two other Pyro- 
coccus species, suggests a very recent transfer from 
bacteria (26). 
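
The comparison of closely related proteomes described above reduces, in its simplest form, to set operations on gene-family identifiers. The following Python sketch is only an illustration: the function name `lineage_specific` and the family labels are hypothetical, and real analyses depend on how orthologous families are defined.

```python
def lineage_specific(families_by_species, target):
    """Return gene families present in `target` but absent from every other species."""
    others = set()
    for name, fams in families_by_species.items():
        if name != target:
            others |= set(fams)
    return set(families_by_species[target]) - others


# Hypothetical gene-family identifiers, invented for illustration only.
toy = {
    "P. furiosus": {"famA", "famB", "famC", "famX"},
    "P. abyssi": {"famA", "famB", "famC", "famY"},
    "P. horikoshii": {"famA", "famB", "famC"},
}

for species in toy:
    print(species, sorted(lineage_specific(toy, species)))
```

As noted in the text, a family found in only one species is compatible either with recent acquisition by that species or with loss in its relatives; the set difference alone cannot distinguish the two.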

Despite these examples, most species-specific pro- 
teins have no homologs in sequence databases, and 
many are likely to be of plasmid or viral origin. This 
is consistent with the extraordinary abundance of 
these extrachromosomal elements in nature, and with 
the general observation that most proteins encoded 
by viral or plasmid DNA (both in archaea and bacte- 
ria) have no homologs in databases. This trend is ex- 
treme for some archaeal viruses. For instance, the 
40-kb genome of PSV (Pyrobaculum Spherical Virus) 
contains 49 putative orphan ORFs (none of them 
with a predicted function) (47). There is likely to be 
a huge reservoir of viral and plasmid genes in the 
biosphere that are continuously providing new genes 
to archaeal and bacterial genomes. The time of resi- 
dence of these orphan genes in cellular genomes may 
be very brief, and they may be rapidly lost. They 
would rarely provide a substantial selective advantage 
to the host and then become fixed in the cellular 
genome. The same fate is expected for genes of cellu- 
lar origin that are carried by extrachromosomal ele- 
ments and transferred to a new genome by transduc- 
tion, conjugation, or transformation. 

HGT has occurred between different archaeal 
lineages, between Archaea and Bacteria, and also pos- 
sibly from Archaea to Eucarya; the latter is suggested 
by the presence of a homolog of archaeal DNA topoi- 
somerase VI that is otherwise exclusively found in 
plants (41), as well as homologs of archaeal alanyl- 
and prolyl-tRNA synthetases in two protist lineages 
(2). The percentage of genes acquired by HGT greatly 
varies from one archaeon to the other. Methanosarci- 
nales, Haloarchaea, and marine Crenarchaeota ap- 
pear to contain a large proportion of genes of bacter- 
ial origin that may have been used to expand their 
metabolic repertoire and for adaptation to mesophily 
(31, 54, 64). The important role that HGT has played 
in the adaptation of archaea to extreme environments 
is discussed below (see “Genomics and adaptation to 
extremophily”). 


Fast Genome Evolution 


In some cases, specific mechanisms (and/or selec- 
tion pressure) can greatly accelerate the evolution 
of a particular genome. The most spectacular case 
of rapid genome evolution in the Archaea is that of 
N. equitans. The genome of N. equitans is highly re- 
duced, probably following adaptation to its symbi- 
otic/parasitic lifestyle. It has lost all enzymes involved 
in metabolic pathways, including those required for 
lipid biosynthesis. N. equitans thus relies completely 
on the metabolism of its host Ignicoccus (113). Most 
spectacularly, 100 of the 301 genes otherwise present 
in all archaeal genomes are missing in N. equitans 
(74). This reduction was coupled to a complete loss of 
operon structure and the split of several genes includ- 
ing those coding for tRNAs. The presence of several 
tRNA genes split in two at the position of an intron 
insertion in the anticodon loop, and of an intein in 
one of the split protein genes, suggests that these types 
of insertion elements may have made important con- 
tributions to the evolution of the N. equitans genome. 
However, the N. equitans genome appears today sta- 
ble since no pseudogenes are found (113). 

Another case of rapid genome evolution in the 
Archaea is that of M. kandleri. Although M. kandleri is 
a free-living organism (unlike N. equitans), its genome 
is highly unusual. In addition to the absence of a nor- 
mal genomic skew pattern (Table 3 and “Chromo- 
some organization,” above), it contains many orphan 
genes, as well as many split and fused genes (109). 
Phylogenetic analyses have shown that M. kandleri 
RNA polymerase subunits have evolved more rapidly 
than in other archaea. This was inferred from the long 
branches displayed by M. kandleri in RNA polymerase 
phylogenies, and from the presence of a higher number 
of indels (insertions/deletions) in their sequences 
than in any other archaeal species that was analyzed 
(13). It was suggested that the rapid evolution of the 
M. kandleri genome may be related to the loss in this 
archaeon of the transcription elongation factor TFS 
(otherwise present in all archaea) (13). This factor is 
evolutionarily related to the eucaryal factor TFIIS, which 
is involved in the fidelity of transcription and in the 
bypass of DNA lesions. TFIIS and TFS stimulate an 
intrinsic RNase activity of RNA polymerase that 
cleaves the 3’ end of nascent messenger RNAs that are 
extruded at stalled forks, thus resuming transcription. 
The absence of TFS may influence the rate of genome 
evolution by an as-yet-unknown mechanism. For ex- 
ample, it may increase homologous recombination to 
promote transcription by stalled transcription ma- 
chinery, and/or somehow affect DNA repair. Such an 
unsuspected link between transcription and evolu- 
tionary rate, inferred from purely in silico analyses, 
highlights the need for experimental investigation. 


GENOMICS AND ADAPTATION 
TO EXTREMOPHILY 


Insight from Protein Content 
and Comparative Genomics 


Most archaea that have been studied experimen- 
tally live in environments that are considered to be ex- 
treme (such as high-salt concentration, low pH, high 
or low temperature) (see Chapter 2). For many years, 
the molecular mechanisms of adaptation to these con- 
ditions could only be studied at the biochemical level. 
The recent availability of archaeal genome sequences 
has greatly expanded the capacity to look for hall- 
marks of extremophily. 

Members of the Archaea are the only organisms 
that can thrive in hydrothermal habitats where tem- 
peratures range from 95 to 113°C. In general, al- 
though hyperthermophilic bacteria exist, archaea are 
predominant in all biotopes with temperatures above 
80°C. The connection between archaea and hyper- 
thermophily is thus pervasive, and it possibly dates 
back to a hyperthermophilic archaeal ancestor. Much 
attention has therefore been given to possible clues of 
adaptation to hyperthermophily in archaeal genomes. 
Surprisingly, comparative genomics analysis has 
shown that all hyperthermophiles (archaea and bac- 
teria) share a single protein that is absent in all non- 
hyperthermophiles: reverse gyrase (34). This unique 
hyperthermophile-specific protein is formed by the fu- 
sion of a classical type I DNA topoisomerase and a 
large helicase-like domain. Reverse gyrase is the only 
DNA topoisomerase that can produce positive super- 
coiling into circular DNA in vitro. Its specific phylo- 
genetic pattern suggests a crucial role in thermo- 
adaptation. Consistent with this hypothesis, a mutant 
of Thermococcus kodakaraensis lacking reverse gy- 
rase grows more slowly than the wild type at 80°C 
and cannot grow above 90°C (4). 

It has been purported that reverse gyrase controls 
intracellular DNA topology in hyperthermophiles, and 
that positive supercoiling is essential for life at high 
temperature. This hypothesis was supported by 
experiments showing that viral or plasmid DNA 
isolated from hyperthermophiles was either relaxed or 
positively supercoiled, compared with the negatively 
supercoiled plasmids isolated from mesophilic bacteria 
or archaea (22). This unusual topology was thought 
to reflect the topology of the chromosome itself. 
Consistent with this idea, the genome sequences of 
archaeal hyperthermophiles exhibit a sequence 
periodicity of ~10 bp (much lower than the 11 bp typically 
observed in bacterial genome sequences), suggesting a 
shorter helical path induced by positive supercoiling 
(49). However, further experimental work showed 
that A. fulgidus, which displays a sequence periodicity 
of 10, contains DNA gyrase, in addition to reverse gy- 
rase, and harbors a negatively supercoiled plasmid 
(65). It was also found that M. kandleri (an archaeon 
with only reverse gyrase) has a sequence periodicity of 
10.9 (105). The sequence periodicity of a given genome 
is therefore unlikely to correlate with the level of DNA 
supercoiling. On the other hand, the presence of neg- 
atively supercoiled plasmids in hyperthermophiles 
harboring DNA gyrase, such as A. fulgidus and Ther- 
motoga maritima (45), indicates that hyperthermo- 
philes can live with this type of DNA topology and 
that gyrase dominates over reverse gyrase in deter- 
mining intracellular DNA topology. In summary, it 
seems that reverse gyrase does not play a major role 
in determining intracellular DNA topology. 

The assumption that reverse gyrase helps to sta- 
bilize DNA at high temperature is also probably 
wrong, since intracellular DNA is highly resistant to 
denaturation (77). Recent data suggest that reverse gy- 
rase may be involved in some form of DNA protection 
against thermodegradation, possibly coupled to DNA 
repair (53, 88). The relationship between such 
activity and the positive supercoiling activity observed in 
vitro is presently unclear. 

By relaxing the protein identification criteria, it is 
possible to identify proteins that are not unique to hy- 
perthermophiles but do exhibit a phylogenomic pro- 
file with preference for hyperthermophiles (72, 75). In 
particular, a cluster of genes is overrepresented 
in hyperthermophilic archaea and bacteria and is 
predicted to form a novel DNA repair system important 
for growth at very high temperature (72). However, as 
described above (Repeated sequences), this cluster in- 
cludes the Cas proteins that appear to be linked to the 
propagation of CRISPR sequences. Since CRISPRs are 
also overrepresented in hyperthermophilic genomes, it 
is not clear whether Cas proteins are really involved in 
high-temperature adaptation. 

Comparative genomics has shown that, for adap- 
tation to extreme environments, microorganisms have 
benefited to a great extent from HGT. This is exem- 
plified by the high number of archaea-like proteins 
that are present in hyperthermophilic bacteria, includ- 
ing reverse gyrase (37, 89). If the common ancestor 
of all archaea was a hyperthermophile (as suggested 
by phylogenetic analyses of reverse gyrase, ribosomal 
proteins, and RNA polymerase subunits [12, 37]), 
low-temperature environments can be considered ex- 
treme for archaea! Indeed, archaea have probably re- 
cruited many proteins from mesophilic bacteria dur- 
ing their adaptation to mesophily. This is strongly 
suggested by the existence of a nearly unidirectional 
flux of HGT from mesophilic bacteria to mesophilic 
archaea (115). Proteins of bacterial origin are espe- 
cially abundant in mesophilic methanogens and in 
members of the Halobacteriales. 

HGT appears to have been especially prevalent 
between thermoacidophiles. For example, the eur- 
yarchaeon Picrophilus oshimae (Thermoplasmatales) 
shares 35% of its genes with pyrococci (another eu- 
ryarchaeon), but 58% with the crenarchaeon, S. sol- 
fataricus. Even more strikingly, 13% of the proteins 
shared by P. oshimae and S. solfataricus are absent in 
Thermoplasma acidophilum, indicating that genes 
have recently been transferred from Sulfolobus to Pi- 
crophilus (reviewed in reference 25). 

A core of 690 genes common to all 
thermoacidophiles (Thermoplasmatales and Sulfolobales) has 
been identified from a comparative genomic analysis 
(40). Many proteins with specific affinity for thermo- 
acidophiles are transporters that could exploit the 
large transmembrane pH gradient present in these mi- 
croorganisms (see Chapter 16). Transporters may also 
function at low pH as uncouplers of the respiratory 
chain and may therefore be involved in the removal 
of organic acids that are harmful to acidophiles (25, 40). 
In particular, P. oshimae has an intracellular pH close 
to 4.6 and contains many enzymes for the degradation 
and transformation of these compounds. Proteins ap- 
parently linked to the thermoacidophile phenotype 
tend to be more similar to enzymes from Crenarch- 
aeota or bacteria than to proteins of other Euryarch- 
aeota, again highlighting an important role for HGT 
in adaptation to new environments. 

Haloarchaea exhibit the extraordinary property 
of accumulating very high concentrations of 
intracellular K+ (up to 3 to 4 M). Consistent with this, the 
genome of Halobacterium NRC1 has a high number 
of active K+ transporters and potential pumps 
involved in Na+ efflux (90) (see Chapter 16). Three 
genomes of halophilic archaea are now available, and 
a comparative analysis of genomes from halophilic 
microorganisms should be available soon. It will be 
particularly interesting to obtain the genome se- 
quence of an extremely halophilic bacterium such as 
Salinibacter ruber to compare the mechanisms of 
high-salt adaptation between archaea and bacteria. 


Insights from Amino Acid Composition 


Analysis of the whole proteome of Halobacterium 
NRC1 confirmed that proteins of halophilic archaea 
are extremely acidic (basic proteins are essentially ab- 
sent), with acidic amino acids concentrated on the 
protein surface (54). In contrast, analysis of the pre- 
dicted proteome of the thermoacidophile, Picrophilus 
torridus, failed to detect any significant bias in amino 
acid composition, even though its proteins are ex- 
posed to an intracellular pH of about 4.6 (40). 

With respect to thermal adaptation, the proteomes 
of hyperthermophiles appear to be especially enriched 
in glutamate, lysine, valine, isoleucine, and tyrosine, 
and depleted in asparagine, glutamine, histidine, ala- 
nine, and threonine (46, 108). The decrease in aspar- 
agine, glutamine, and histidine is probably correlated 
to the sensitivity of these amino acids (especially Asn) 
to thermodegradation (112). A comparison between 
the complete, predicted proteomes of 53 mesophiles, 
9 thermophiles, and 9 hyperthermophiles revealed a 
clear trend in the polar index of amino acids, increas- 
ing charged (aspartate, glutamate, lysine, arginine) 
versus polar (noncharged) amino acids (asparagine, 
glutamine, serine, threonine) (19, 110) (Fig. 7). The si- 
multaneous increase of positively and negatively 
charged amino acids fits a priori well with previous 
biochemical data obtained from the comparative 
structural analysis of hyperthermophilic proteins with 
their mesophilic homologs. These biochemical studies 
have indicated that a major determinant in the stabi- 
lization of proteins at very high temperature is the 
formation of networks of ion pairs at their surface 
(98). However, this interpretation based on the polar 
index of amino acids was partially challenged by a 
recent study using unfolding simulation experiments 
that showed that hyperthermophilic proteins are 
structurally adapted to high temperature through 
compactness and higher residue hydrophobicity (7). 
In this study, the number of charged residues in 
hyperthermophiles was reported to be much greater than 
it would need to be for the stabilization by ion pairs 
(7). The high polar index observed in the genome of 
hyperthermophilic archaea could thus have an 
alternative purpose (see “Nucleotide composition and 
codon index,” below). The trends between 
temperature and amino acid polarity were confirmed in a 
comparative analysis of genomes from psychrophilic 
archaea, which showed that proteins from 
cold-adapted archaea are characterized by a high content 
of noncharged polar amino acids (104). 

Figure 7. The percentage of various classes of amino acids in gene 
coding sequences from hyperthermophiles and mesophiles. 
Percentages were calculated for 9 fully sequenced hyperthermophilic and 
53 mesophilic bacteria and archaea. Grey and black vertical bars 
indicate the mesophiles and the hyperthermophiles, respectively. 
The y axis indicates percentages. CHA, charged; POL, polar; 
ALIPH, aliphatic; AROM, aromatic. 
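
The shift from polar to charged residues discussed in this section can be expressed as a single number per proteome. The sketch below is a minimal illustration, assuming that the difference between the percentage of charged residues (D, E, K, R) and the percentage of noncharged polar residues (N, Q, S, T) is an acceptable summary; the function name and the toy sequences are hypothetical, and the cited studies may define their index differently.

```python
CHARGED = set("DEKR")  # aspartate, glutamate, lysine, arginine
POLAR = set("NQST")    # asparagine, glutamine, serine, threonine


def charged_minus_polar(protein):
    """Return (% charged) - (% noncharged polar) residues of a one-letter protein string."""
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty sequence")
    charged = sum(aa in CHARGED for aa in residues)
    polar = sum(aa in POLAR for aa in residues)
    return 100.0 * (charged - polar) / len(residues)


# Toy sequences, invented for illustration only.
print(charged_minus_polar("EEKKRAAV"))    # 5 charged, 0 polar, 8 residues
print(charged_minus_polar("DEKRNQSTAA"))  # 4 charged, 4 polar, 10 residues
```

Applied to a whole predicted proteome (all protein sequences concatenated), a higher value would indicate a composition shifted toward charged residues, as reported for hyperthermophiles.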


Nucleotide Composition and Codon Index 


Genomics data have confirmed preliminary ob- 
servations that G+C content of tRNA and rRNA in- 
creases, as expected, with higher optimal growth tem- 
perature (OGT), whereas the genomic G+C content 
is not correlated to OGT (43). In a recent review, 
Hickey and Singer stated that “the obvious question 
that comes to mind is why we observed the expected 
correlation between nucleotide content and growth 
temperature in the paired region of RNA molecules 
(rRNA, tRNA) but not in double-stranded DNA?” 
(50). The answer is that, unlike tRNA and 
rRNA, which are topologically open structures 
(77-79), intracellular DNA can be assumed to be a 
topologically closed DNA, because the two DNA strands 
cannot freely rotate around each other in the cell (for 
instance, because of the inertia of large transcription 
complexes bound by ribosomes, especially those 
translating membrane proteins anchored to the cyto- 
plasmic membrane). As a consequence, intracellular 
DNA is intrinsically resistant to thermodenaturation 
and does not need to be stabilized by high GC. In par- 
ticular, it was shown that a circular double-stranded 
plasmid (topologically closed DNA) remains double 
stranded up to at least 107°C, whereas the same plas- 
mid, once linearized, denatures at 78°C (77). 
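
The comparison made above rests on measuring G+C content separately for structural RNA genes and for the genome as a whole. A minimal Python sketch of that measurement follows; the function name `gc_content` is hypothetical, ambiguous bases are simply ignored, and no real sequence data are implied.

```python
def gc_content(seq):
    """Return the G+C fraction of a nucleotide sequence, ignoring ambiguous bases."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")
    if gc + at == 0:
        raise ValueError("no unambiguous bases")
    return gc / (gc + at)


# Toy sequences, invented for illustration only.
print(gc_content("GGCCAATT"))  # DNA example
print(gc_content("GGGCU"))     # RNA example (U counted with A and T)
```

To reproduce the kind of analysis summarized in the text, one would compute this value for the rRNA and tRNA genes and for the complete chromosome of each species, then compare each series with the optimal growth temperature.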

A high genomic GC content could also stabilize 
codon-anticodon interaction. Two important papers 
have clarified the role of GC content in codon usage 
by looking at the possible preference for synonymous 
codons in relation to OGT. It was recently shown that 
hyperthermophiles exhibit a preference for C and G 
ending codons, but only when the first two positions 
are AT base pairs (5). In particular, a strong prefer- 
ence was noticed for AAG among arginine codons. 
In contrast, C or G is avoided at third positions when 
the second base pair is G or C. This can be explained 
by a general rule of avoiding adjacent 
GC base pairs in the anticodon because they are probably 
too sticky for correct codon-anticodon interactions 
(5). It was independently reported that synonymous 
codon usage is subject to selection in hyperthermo- 
philes, and a strong preference exists for AGR, CGN 
(arginine) and ATH (isoleucine) codons (68). The dif- 
ference in codon usage between hyperthermophiles 
and mesophiles was more pronounced in highly ex- 
pressed genes (68). A conclusion of the study was that 
codon preferences in hyperthermophiles may be re- 
lated to mRNA stability, rather than specific codon- 
anticodon base pairing (68). 
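
Preferences among synonymous codons, such as those discussed above for arginine, are measured by counting codons in coding sequences. The sketch below is an illustration only: it assumes an in-frame coding sequence and the standard genetic code, in which arginine is encoded by AGA, AGG, and the four CGN codons; the function names and the toy sequence are hypothetical.

```python
from collections import Counter

ARG_CODONS = ("AGA", "AGG", "CGA", "CGC", "CGG", "CGT")


def arginine_codon_usage(cds):
    """Return the relative frequency of each arginine codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop any incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codon for codon in codons if codon in ARG_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in ARG_CODONS}
    return {codon: counts[codon] / total for codon in ARG_CODONS}


def agr_fraction(cds):
    """Return the share of arginine codons that are AGA or AGG (AGR)."""
    usage = arginine_codon_usage(cds)
    return usage["AGA"] + usage["AGG"]


# Toy coding sequence, invented for illustration: ATG AGA AGG CGT TAA
print(agr_fraction("ATGAGAAGGCGTTAA"))
```

Restricting the input to highly expressed genes, as was done in the study cited above, only changes which coding sequences are passed to the function.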

A global analysis has shown that archaea and 
bacteria tend to acquire A and lose C at high temper- 
ature, while keeping T and G relatively constant (57). 
As a consequence, the genome sequences of hyper- 
thermophiles are enriched in purine (purine loading) 
with a marked preference for adenine and a decrease 
in cytosine (57, 58, 108). It has been speculated that 
the increase in purine could minimize unnecessary 
RNA-RNA interactions and prevent the formation of 
“self”-double-stranded RNA. Since RNA-RNA inter- 
actions have a strong entropy-driven component, the 
need to minimize them should increase as temperature 
increases. Adenine could also help stabilize mRNA 
against thermodegradation at high temperature, as it 
would stabilize the structure of mRNA (20). 
Avoidance of RNA thermodegradation may indeed 
represent a major survival strategy in hyperthermophiles 
(36). A recent, exhaustive study on purine loading 
reported that mRNAs corresponding to highly ex- 
pressed genes were especially enriched in adenine in 
hyperthermophiles (93). Furthermore, the study re- 
ported a strong bias in favor of polypurine tracts, 
with runs of more than five As in the mRNAs of 
hyperthermophiles. This correlates with the observed 
(glutamate + lysine)/(glutamine + histidine) ratio (110) 
and could explain why the number of charged residues 
in hyperthermophiles is much greater than would be 
necessary for the stabilization by ion pairs (7). In con- 
clusion, life at high temperature appears to involve a 
complex web of different mechanisms acting at the 
genome level. 
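
The link drawn above between purine loading and amino acid composition follows from the genetic code: lysine is encoded by AAA and AAG and glutamate by GAA and GAG, so purine-rich mRNAs tend to encode more of these charged residues. The following Python sketch, with hypothetical function names and toy sequences, shows how the two quantities mentioned in the text could be computed.

```python
def base_fractions(mrna):
    """Return the fraction of each base (A, C, G, U) in an mRNA-sense sequence."""
    seq = mrna.upper().replace("T", "U")
    counts = {base: seq.count(base) for base in "ACGU"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases")
    return {base: n / total for base, n in counts.items()}


def purine_fraction(mrna):
    """Return the fraction of purines (A + G) in the sequence."""
    fractions = base_fractions(mrna)
    return fractions["A"] + fractions["G"]


def ek_qh_ratio(protein):
    """Return (Glu + Lys) / (Gln + His) for a one-letter protein string."""
    protein = protein.upper()
    denominator = protein.count("Q") + protein.count("H")
    if denominator == 0:
        raise ValueError("no Gln or His residues")
    return (protein.count("E") + protein.count("K")) / denominator


# Toy sequences, invented for illustration only.
print(purine_fraction("AAGGCU"))
print(ek_qh_ratio("EEKKQH"))
```

A purine fraction above 0.5 on the coding strand corresponds to the purine loading described in the text.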


INSIGHTS INTO THE HISTORY 
OF THE ARCHAEA FROM 
COMPARATIVE GENOMICS 


From the time of their “discovery,” archaea have 
been at the center of heated debates about the evolu- 
tion of early life. Despite wide expectations, compara- 
tive archaeal genomics has not been very helpful in set- 
tling debates concerning the topology of the universal 
tree, and the root of the tree. Proponents of various 
contradictory hypotheses have often selected from 
genomic data whatever appeared to support their 
favorite scenario. The eucaryal character of archaeal 
informational proteins was used to confirm the sister- 
hood of archaea and eucarya, previously inferred from 
the rooting of the tree of life by universal couples of 
anciently duplicated paralogs (116). However, this in- 
terpretation can be misleading, since it cannot be de- 
termined from comparative genomics alone whether 
the features common to archaea and eucarya are prim- 
itive or derived traits. Comparative genomics can only 
provide evidence of characteristics shared between 
domains, but it does not help polarize these characters. 
(For example, a trait uniquely shared between archaea 
and eucarya [AE trait] may have originated in the 
branch leading to their common ancestor, and thus be 
considered to be derived. However, it is equally 
likely that the AE trait is ancestral and was lost in the 
branch leading to bacteria.) The bacterial rooting of 
the universal tree remains controversial, and some ge- 
nomewide analyses even support the rooting of the 
tree of life in the eucaryal branch (18), as it was sug- 
gested from a careful analysis of the universal couple 
of paralogs composed of the signal recognition particle 
and its receptor (11). 

Despite the eucaryal affinity of their core genes, 
the general features (such as operon structure) and 
mechanisms of evolution of archaeal genomes (such as 
replication-driven rearrangements) are very similar to 
those of bacteria. If these features have a common an- 
cestry, this indicates that they were already present in 
a common ancestor of archaea and bacteria. However, 
some of these features could also be due to conver- 
gent evolution because of similar lifestyles, or else, 
they could reflect a common mechanism at the origin 
of the archaeal and bacterial domains. For instance, 
it has been suggested that both archaeal and bacterial 
genomes may have been derived from larger genomes 
by streamlining, with traits being retained in extant 
eucarya (96). Streamlining may have been triggered by 
adaptation to thermophilic environments (36). 

The sequencing of archaeal genomes has been ex- 
tremely fruitful for unraveling the evolutionary history 
of the Archaea. Crenarchaeal genomes do not contain 
more “eucarya-like genes” than euryarchaeal genomes. 
This argues against the “eocyte theory,” which postu- 
lated a specific phylogenetic affinity between eucarya 
and Crenarchaeota and implied that archaea are not 
monophyletic (56). Archaeal comparative genomics 
has confirmed the importance of the Euryarchaeota- 
Crenarchaeota divide. Members of these two archaeal 
kingdoms exhibit striking differences in their DNA 
replication and cell division machineries. Crucial cell 
division proteins, such as MinD and FtsZ, appear to 
be exclusive to Euryarchaeota and Nanoarchaeota 
and systematically absent from all current crenar- 
chaeal genomes. The same is true for DNA polymerase 
of the D family and for histones, whose functions are 
likely performed by nonhomologous proteins in Cre- 
narchaeota (44, 114). On the other hand, some of 
these differences between Euryarchaeota and Cren- 
archaeota may have to be reevaluated after the se- 
quencing of marine Crenarchaeota; archaeal histone 
genes were recently discovered in several contigs of 
uncultured marine Crenarchaeota and in Cenarch- 
aeum symbiosum (28), although it cannot be excluded 
at present that they may have been acquired by HGT 
from Euryarchaeota. The present sequencing of a 
member of the tentative phylum, Korarchaeota, should 
provide especially important information on the early 
diversification in the Archaea. 

The genomic era has led to an explosion of “whole 
genome trees,” where different characteristics, such 
as the number of shared genes, shared protein folds, 
or even metabolic pathways, replace nucleotide or 
amino acid substitutions to quantify evolutionary 
distances. A variant is the supertree approach, which 
combines phylogenetic trees obtained from different genes 
(for a review see reference 30). All genome trees clus- 
ter all archaea together as a coherent group, distinct 
from bacteria and eucarya, thereby confirming the 
three-domain concept. Except for a universal 
supertree (29), all of them failed to recover the 
Euryarchaeota-Crenarchaeota divide. Moreover, 
depending on the method of reconstruction, the 
whole-genome trees usually place Haloarchaea and/or 
Thermoplasma (two Euryarchaeota) either at the base of 
the Archaea or as sister groups to Crenarchaeota. This 
indicates that archaeal whole-genome trees are strongly 
affected by the extensive HGT that occurred within 
archaea, as well as between archaea and bacteria. In 
fact, the different archaeal genomes have not been af- 
fected to the same extent by HGT, and this may ex- 
plain why only some archaea are clearly misplaced. 
In particular, the genomes of Haloarchaea and Ther- 
moplasmatales contain a high number of genes of bac- 
terial origin, explaining why they are attracted toward 
the Bacteria in whole-genome trees and, as a conse- 
quence, emerge at the base of the Archaea. For Ther- 
moplasmatales, biased placement is increased by the 
presence of many genes in their genomes that appear 
to have been recruited from thermoacidophilic Cre- 
narchaeota (Sulfolobales). 
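
The sensitivity of gene-content trees to HGT can be seen in a small numerical example. The Python sketch below assumes one simple distance, one minus the number of shared genes divided by the size of the smaller genome; this normalization, the function names, and the toy gene sets are assumptions made for illustration and do not reproduce any particular published method.

```python
from itertools import combinations


def gene_content_distance(genes_a, genes_b):
    """Return 1 - shared / min(|A|, |B|), a simple gene-content distance."""
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        raise ValueError("empty gene set")
    return 1.0 - len(a & b) / min(len(a), len(b))


def distance_matrix(genomes):
    """Return pairwise distances keyed by alphabetically ordered genome-name pairs."""
    return {
        (x, y): gene_content_distance(genomes[x], genomes[y])
        for x, y in combinations(sorted(genomes), 2)
    }


# Toy genomes, invented for illustration: the second archaeon has replaced
# half of its gene set with genes of bacterial origin.
toy_genomes = {
    "archaeon_1": {"a1", "a2", "a3", "a4", "a5", "a6"},
    "archaeon_2_hgt": {"a1", "a2", "a3", "b1", "b2", "b3"},
    "bacterium": {"b1", "b2", "b3", "b4", "b5", "b6"},
}

for pair, distance in distance_matrix(toy_genomes).items():
    print(pair, distance)
```

In this toy case the archaeon that acquired bacterial genes is as close to the bacterium as to the other archaeon, which is the kind of attraction toward the Bacteria described above for Haloarchaea and Thermoplasmatales.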

Whole-genome trees based on shared character- 
istics are therefore not reliable for obtaining a good 
phylogeny of the Archaea. Presently, the best meth- 
ods are those that focus on selected sets of proteins 
that are less prone to HGT over large evolutionary 
distances, such as ribosomal proteins and RNA poly- 
merase subunits. Several precautions have to be con- 
sidered to generate a “good” archaeal tree. In partic- 
ular, outgroup sequences (eucaryal and/or bacterial) 
that are normally used to root archaeal trees should 
be excluded from analysis to avoid possible long- 
branch attraction artifacts. Furthermore, individual 
trees should be analyzed carefully before concatena- 
tion to detect evident cases of HGT and remove the 
corresponding genes from the dataset. Following this 
strategy, the construction of datasets of concatenated 
ribosomal proteins and RNA polymerase subunits 
that are common to all archaea and have not been in- 
volved in HGT, and the accurate construction of the 
corresponding “translation” and “transcription” trees, 
has allowed a few questions to be solved that had 
remained confusing in 16S rRNA trees (12). 
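
The strategy outlined above (inspect individual trees, discard genes with evident HGT, then concatenate the remainder) can be sketched in a few lines of Python. This is a schematic illustration only: the function name, the gene labels, and the toy alignments are hypothetical, and the decision about which genes to exclude is assumed to have been made beforehand from the single-gene trees.

```python
def concatenate_alignments(alignments, taxa, exclude=()):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: dict gene -> dict taxon -> aligned sequence
    taxa: taxa to keep; a gene lacking any of them is skipped
    exclude: genes flagged beforehand (e.g., suspected HGT)
    Returns (dict taxon -> concatenated sequence, list of genes used).
    """
    parts = {taxon: [] for taxon in taxa}
    used = []
    for gene in sorted(alignments):
        rows = alignments[gene]
        if gene in exclude or any(taxon not in rows for taxon in taxa):
            continue
        if len({len(rows[taxon]) for taxon in taxa}) != 1:
            raise ValueError(f"unequal row lengths in {gene}")
        for taxon in taxa:
            parts[taxon].append(rows[taxon])
        used.append(gene)
    return {taxon: "".join(seqs) for taxon, seqs in parts.items()}, used


# Toy alignments with hypothetical gene labels, invented for illustration.
toy_alignments = {
    "rpl2": {"sp1": "MK-A", "sp2": "MKTA"},
    "rps3": {"sp1": "GGV", "sp2": "GAV"},
    "rpoB": {"sp1": "LLS", "sp2": "LIS"},
}

matrix, genes_used = concatenate_alignments(
    toy_alignments, ["sp1", "sp2"], exclude={"rps3"}
)
print(genes_used)
print(matrix)
```

Requiring every retained gene to be present in all taxa mirrors the restriction in the text to proteins common to all archaea; the resulting supermatrix would then be passed to a tree-building program.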

The translation and transcription trees based on 
the analysis of 23 complete or draft genomes (Fig. 8, 
A and B, respectively) are largely congruent, with one 
exception (see below), confirming the existence of a 
core of proteins that can be used to reconstruct the 
history of the Archaea domain (12). The results ob- 
tained have validated most of the evolutionary rela- 
tionships previously obtained from 16S rRNA se- 
quence comparison. In particular, they have confirmed 
that Halobacteriales are a sister group of 
Methanomicrobiales, and that Thermoplasmatales belong to a 
large clade comprising Halobacteriales, 
Methanomicrobiales, and Archaeoglobales. The only difference 
between the translation and transcription trees is the 
position of M. kandleri. M. kandleri branches with 
Methanobacteriales in the translation tree (Fig. 8A), 
whereas it is located at the base of Euryarchaeota in 
the transcription tree (Fig. 8B), as it is in the 16S 
rRNA tree. However, the position of M. kandleri in 
the transcription tree is most likely an artifact of 
long-branch attraction induced by the rapid evolution of 
its RNA polymerase (see “Mechanisms of genome 
evolution” above). This interpretation is apparent 
from the very long branch displayed by M. kandleri in 
the transcription tree, compared with the translation 
tree (Fig. 8) (12). 

Figure 8. Unrooted maximum likelihood (ML) trees based on concatenation of 53 ribosomal proteins (A) and 12 RNA 
polymerase subunits and two transcription factors (B). Numbers at nodes are bootstrap values (BVs). The scale bars represent the 
number of changes per position for a unit branch length. Trees were produced by exhaustive searches performed by PROTML 
(1). Branch lengths and likelihood values were calculated by TREE-PUZZLE (JTT model including a Γ-correction [8 categories 
of sites]) (106). Numbers at nodes are bootstrap values computed with PUZZLEBOOT (51) from 1,000 replications. 
Asterisks indicate constrained nodes (supported by BV = 100% in preliminary NJ and heuristic ML analyses). The names of groups 
showing incongruent positions between the two trees are in bold. (Adapted from reference 12 with permission.) 

For some organisms, the congruence of transcription
and translation trees is not sufficient to reconstruct
the history of a particular lineage. This is illustrated
by N. equitans, which branches independently from
Euryarchaeota and Crenarchaeota in both trees, although
it may be a distant relative of Thermococcales. Indeed,
a specific phylogenetic proximity of N. equitans and
Thermococcales is indicated by several single-gene
phylogenies (elongation factors, reverse gyrase, DNA
topoisomerase VI, tyrosyl-tRNA synthetase) and by the
concatenation of only the proteins of the small
ribosomal subunit (14). The abnormal branching of
N. equitans in the rRNA, transcription, and translation
trees is probably due to a combination of a long-branch
attraction artifact (caused by the rapid evolution of
its proteins under pressure for reductive evolution)
and HGT from its Ignicoccus host (especially for the
proteins of the large ribosomal subunit). In conclusion,
the analysis of the components of two independent
molecular systems, together with a critical evaluation
of the positions of M. kandleri and N. equitans, allows
a reasonable view of archaeal history to be proposed
(depicted in Fig. 9).

Figure 9. An ideal tree of the Archaea based on recent phylogenetic analyses of large concatenated protein data sets.


PERSPECTIVE: THE NEXT FIVE YEARS 


Nearly ten years after the first archaeal genome
was sequenced, a good knowledge of the structure and
evolution of archaeal chromosomes has been obtained.
Nevertheless, the gap between the numbers of complete
genome sequences for archaea and for bacteria remains
large, and this bias will probably persist because
archaea are not considered a threat to human health.
However, this may change in the near future with the
development of metagenomics projects aimed at
identifying the complete flora of human commensals.
Moreover, archaea may prove to be important players
in human illness (21).

Ongoing metagenomics projects will continue to
broaden the understanding of archaeal diversity and
evolution. The cultivation of species that have been
detected only in sequences from environmental samples
will be a key step toward understanding the features of
the organisms that occupy critical positions in
archaeal phylogeny; these include the deep-branching
Korarchaeota and the Crenarchaeota thriving in cold
environments. This will help to gain a better
understanding of the nature of the archaeal ancestor
itself and of the origin of the major differences
between the two major archaeal phyla, the Euryarchaeota
and the Crenarchaeota.

Several features of archaeal chromosome structure
and evolution remain to be determined. For instance,
not much is known about the termination of DNA
replication and its associated sequence sites.
Increasing the number of complete archaeal genomes
will permit a thorough comparative analysis that
should help clarify this issue in the near future.
Comparative genomics will also allow a more complete
investigation of the nature and extent of HGT in the
Archaea and will possibly provide clues to the
mechanisms responsible for the passage of genetic
information between archaeal and bacterial species.
Since archaeal viruses may be key players in this
process, ongoing studies should soon shed light on this
issue by providing a better understanding of their
diversity, biology, and survival strategies.


Acknowledgments. We thank the editor for accurate revision of the
manuscript and valuable comments, Karsten Suhre and Jean-Michel
Claverie for kindly providing Figure 7, Celine Brochier for
kindly providing Figure 9, and Diego Cortez for kindly providing
Figure 2 and for artwork on Figures 1 and 4.


REFERENCES


1. Adachi, J., and M. Hasegawa. 1996. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput. Sci. Monogr. 28:1-150.
2. Andersson, J. O., S. W. Sarchfield, and A. J. Roger. 2005. Gene transfers from nanoarchaeota to an ancestor of diplomonads and parabasalids. Mol. Biol. Evol. 22:85-90.
3. Aravind, L., and E. V. Koonin. 1999. DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res. 27:4658-4670.
4. Atomi, H., R. Matsumi, and T. Imanaka. 2004. Reverse gyrase is not a prerequisite for hyperthermophilic life. J. Bacteriol. 186:4829-4833.
5. Basak, S., and T. C. Ghosh. 2005. On the origin of genomic adaptation at high temperature for prokaryotic organisms. Biochem. Biophys. Res. Commun. 330:629-632.
6. Bentley, S. D., K. F. Chater, A. M. Cerdeno-Tarraga, G. L. Challis, N. R. Thomson, K. D. James, D. E. Harris, M. A. Quail, H. Kieser, D. Harper, A. Bateman, S. Brown, G. Chandra, C. W. Chen, M. Collins, A. Cronin, A. Fraser, A. Goble, J. Hidalgo, T. Hornsby, S. Howarth, C. H. Huang, T. Kieser, L. Larke, L. Murphy, K. Oliver, S. O’Neil, E. Rabbinowitsch, M. A. Rajandream, K. Rutherford, S. Rutter, K. Seeger, D. Saunders, S. Sharp, R. Squares, S. Squares, K. Taylor, T. Warren, A. Wietzorrek, J. Woodward, B. G. Barrell, J. Parkhill, and D. A. Hopwood. 2002. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature 417:141-147.
7. Berezovsky, I. N., and E. I. Shakhnovich. 2005. Physics and evolution of thermophilic adaptation. Proc. Natl. Acad. Sci. USA 102:12742-12747.
8. Berquist, B. R., and S. DasSarma. 2003. An archaeal chromosomal autonomously replicating sequence element from an extreme halophile, Halobacterium sp. strain NRC-1. J. Bacteriol. 185:5959-5966.
9. Blattner, F. R., G. Plunkett III, C. A. Bloch, N. T. Perna, V. Burland, M. Riley, J. Collado-Vides, J. D. Glasner, C. K. Rode, G. F. Mayhew, J. Gregor, N. W. Davis, H. A. Kirkpatrick, M. A. Goeden, D. J. Rose, B. Mau, and Y. Shao. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453-1474.
10. Brewer, B. J. 1988. When polymerases collide: replication and the transcriptional organization of the E. coli chromosome. Cell 53:679-686.
11. Brinkmann, H., and H. Philippe. 1999. Archaea sister group of Bacteria? Indications from tree reconstruction artifacts in ancient phylogenies. Mol. Biol. Evol. 16:817-825.
12. Brochier, C., P. Forterre, and S. Gribaldo. 2005. An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences. BMC Evol. Biol. 5:36.


13. Brochier, C., P. Forterre, and S. Gribaldo. 2004. Archaeal phylogeny based on proteins of the transcription and translation machineries: tackling the Methanopyrus kandleri paradox. Genome Biol. 5:R17.
14. Brochier, C., S. Gribaldo, Y. Zivanovic, F. Confalonieri, and P. Forterre. 2005. Nanoarchaea: representatives of a novel archaeal phylum or a fast-evolving euryarchaeal lineage related to Thermococcales? Genome Biol. 6:R42.
15. Brügger, K., P. Redder, Q. She, F. Confalonieri, Y. Zivanovic, and R. A. Garrett. 2002. Mobile elements in archaeal genomes. FEMS Microbiol. Lett. 206:131-141.
16. Brügger, K., E. Torarinsson, P. Redder, L. Chen, and R. A. Garrett. 2004. Shuffling of Sulfolobus genomes by autonomous and non-autonomous mobile elements. Biochem. Soc. Trans. 32:179-183.
17. Bult, C. J., O. White, G. J. Olsen, L. Zhou, R. D. Fleischmann, G. G. Sutton, J. A. Blake, L. M. FitzGerald, R. A. Clayton, J. D. Gocayne, A. R. Kerlavage, B. A. Dougherty, J. F. Tomb, M. D. Adams, C. I. Reich, R. Overbeek, E. F. Kirkness, K. G. Weinstock, J. M. Merrick, A. Glodek, J. L. Scott, N. S. M. Geoghagen, and J. C. Venter. 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058-1073.
18. Caetano-Anolles, G., and D. Caetano-Anolles. 2005. Universal sharing patterns in proteomes and evolution of protein fold architecture and life. J. Mol. Evol. 60:484-498.
19. Cambillau, C., and J. M. Claverie. 2000. Structural and genomic correlates of hyperthermostability. J. Biol. Chem. 275:32383-32386.
20. Cate, J. H., A. R. Gooding, E. Podell, K. Zhou, B. L. Golden, A. A. Szewczak, C. E. Kundrot, T. R. Cech, and J. A. Doudna. 1996. RNA tertiary structure mediation by adenosine platforms. Science 273:1696-1699.
21. Cavicchioli, R., P. M. Curmi, N. Saunders, and T. Thomas. 2003. Pathogenic archaea: do they exist? Bioessays 25:1119-1128.
22. Charbonnier, F., and P. Forterre. 1994. Comparison of plasmid DNA topology among mesophilic and thermophilic eubacteria and archaebacteria. J. Bacteriol. 176:1251-1259.
23. Chen, L., K. Brügger, M. Skovgaard, P. Redder, Q. She, E. Torarinsson, B. Greve, M. Awayez, A. Zibat, H. P. Klenk, and R. A. Garrett. 2005. The genome of Sulfolobus acidocaldarius, a model organism of the Crenarchaeota. J. Bacteriol. 187:4992-4999.
24. Chinen, A., I. Uchiyama, and I. Kobayashi. 2000. Comparison between Pyrococcus horikoshii and Pyrococcus abyssi genome sequences reveals linkage of restriction-modification genes with large genome polymorphisms. Gene 259:109-121.
25. Ciaramella, M., A. Napoli, and M. Rossi. 2005. Another extreme genome: how to live at pH 0. Trends Microbiol. 13:49-51.
26. Cohen, G. N., V. Barbe, D. Flament, M. Galperin, R. Heilig, O. Lecompte, O. Poch, D. Prieur, J. Querellou, R. Ripp, J. C. Thierry, J. Van der Oost, J. Weissenbach, Y. Zivanovic, and P. Forterre. 2003. An integrated analysis of the genome of the hyperthermophilic archaeon Pyrococcus abyssi. Mol. Microbiol. 47:1495-1512.
27. Constantinesco, F., P. Forterre, E. V. Koonin, L. Aravind, and C. Elie. 2004. A bipolar DNA helicase gene, herA, clusters with rad50, mre11 and nurA genes in thermophilic archaea. Nucleic Acids Res. 32:1439-1447.
28. Cubonova, L., K. Sandman, S. J. Hallam, E. F. Delong, and J. N. Reeve. 2005. Histones in crenarchaea. J. Bacteriol. 187:5482-5485.
29. Daubin, V., M. Gouy, and G. Perriere. 2002. A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Res. 12:1080-1090.




30. Delsuc, F., H. Brinkmann, and H. Philippe. 2005. Phylogenomics and the reconstruction of the tree of life. Nat. Rev. Genet. 6:361-375.
31. Deppenmeier, U., A. Johann, T. Hartsch, R. Merkl, R. A. Schmitz, R. Martinez-Arias, A. Henne, A. Wiezer, S. Baumer, C. Jacobi, H. Bruggemann, T. Lienard, A. Christmann, M. Bomeke, S. Steckel, A. Bhattacharyya, A. Lykidis, R. Overbeek, H. P. Klenk, R. P. Gunsalus, H. J. Fritz, and G. Gottschalk. 2002. The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea. J. Mol. Microbiol. Biotechnol. 4:453-461.
32. Diruggiero, J., D. Dunn, D. L. Maeder, R. Holley-Shanks, J. Chatard, R. Horlacher, F. T. Robb, W. Boos, and R. B. Weiss. 2000. Evidence of recent lateral gene transfer among hyperthermophilic archaea. Mol. Microbiol. 38:684-693.
33. Fitz-Gibbon, S. T., H. Ladner, U. J. Kim, K. O. Stetter, M. I. Simon, and J. H. Miller. 2002. Genome sequence of the hyperthermophilic crenarchaeon Pyrobaculum aerophilum. Proc. Natl. Acad. Sci. USA 99:984-989.
34. Forterre, P. 2002. A hot story from comparative genomics: reverse gyrase is the only hyperthermophile-specific protein. Trends Genet. 18:236-237.
35. Forterre, P. 1997. Archaea: what can we learn from their sequences? Curr. Opin. Genet. Dev. 7:764-770.
36. Forterre, P. 1995. Thermoreduction, a hypothesis for the origin of prokaryotes. C. R. Acad. Sci. Ser. III 318:415-422.
37. Forterre, P., C. Bouthier De La Tour, H. Philippe, and M. Duguet. 2000. Reverse gyrase from hyperthermophiles: probable transfer of a thermoadaptation trait from archaea to bacteria. Trends Genet. 16:152-154.
38. Frank, A. C., and J. R. Lobry. 1999. Asymmetric substitution patterns: a review of possible underlying mutational or selective mechanisms. Gene 238:65-77.
39. Fukui, T., H. Atomi, T. Kanai, R. Matsumi, S. Fujiwara, and T. Imanaka. 2005. Complete genome sequence of the hyperthermophilic archaeon Thermococcus kodakaraensis KOD1 and comparison with Pyrococcus genomes. Genome Res. 15:352-363.
40. Futterer, O., A. Angelov, H. Liesegang, G. Gottschalk, C. Schleper, B. Schepers, C. Dock, G. Antranikian, and W. Liebl. 2004. Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc. Natl. Acad. Sci. USA 101:9091-9096.
41. Gadelle, D., J. Filee, C. Buhler, and P. Forterre. 2003. Phylogenomics of type II DNA topoisomerases. Bioessays 25:232-242.
42. Galagan, J. E., C. Nusbaum, A. Roy, M. G. Endrizzi, P. Macdonald, W. FitzHugh, S. Calvo, R. Engels, S. Smirnov, D. Atnoor, A. Brown, N. Allen, J. Naylor, N. Stange-Thomann, K. DeArellano, R. Johnson, L. Linton, P. McEwan, K. McKernan, J. Talamas, A. Tirrell, W. Ye, A. Zimmer, R. D. Barber, I. Cann, D. E. Graham, D. A. Grahame, A. M. Guss, R. Hedderich, C. Ingram-Smith, H. C. Kuettner, J. A. Krzycki, J. A. Leigh, W. Li, J. Liu, B. Mukhopadhyay, J. N. Reeve, K. Smith, T. A. Springer, L. A. Umayam, O. White, R. H. White, E. Conway de Macario, J. G. Ferry, K. F. Jarrell, H. Jing, A. J. Macario, I. Paulsen, M. Pritchett, K. R. Sowers, R. V. Swanson, S. H. Zinder, E. Lander, W. W. Metcalf, and B. Birren. 2002. The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res. 12:532-542.
43. Galtier, N., and J. R. Lobry. 1997. Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J. Mol. Evol. 44:632-636.
44. Grabowski, B., and Z. Kelman. 2003. Archaeal DNA replication: eukaryal proteins in a bacterial context. Annu. Rev. Microbiol. 57:487-516.


45. Guipaud, O., E. Marguet, K. M. Noll, C. B. de la Tour, and P. Forterre. 1997. Both DNA gyrase and reverse gyrase are present in the hyperthermophilic bacterium Thermotoga maritima. Proc. Natl. Acad. Sci. USA 94:10606-10611.
46. Haney, P. J., J. H. Badger, G. L. Buldak, C. I. Reich, C. R. Woese, and G. J. Olsen. 1999. Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species. Proc. Natl. Acad. Sci. USA 96:3578-3583.
47. Haring, M., X. Peng, K. Brügger, R. Rachel, K. O. Stetter, R. A. Garrett, and D. Prangishvili. 2004. Morphology and genome organization of the virus PSV of the hyperthermophilic archaeal genera Pyrobaculum and Thermoproteus: a novel virus family, the Globuloviridae. Virology 323:233-242.
48. Hendrickson, E. L., R. Kaul, Y. Zhou, D. Bovee, P. Chapman, J. Chung, E. Conway de Macario, J. A. Dodsworth, W. Gillett, D. E. Graham, M. Hackett, A. K. Haydock, A. Kang, M. L. Land, R. Levy, T. J. Lie, T. A. Major, B. C. Moore, I. Porat, A. Palmeiri, G. Rouse, C. Saenphimmachak, D. Soll, S. Van Dien, T. Wang, W. B. Whitman, Q. Xia, Y. Zhang, F. W. Larimer, M. V. Olson, and J. A. Leigh. 2004. Complete genome sequence of the genetically tractable hydrogenotrophic methanogen Methanococcus maripaludis. J. Bacteriol. 186:6956-6969.
49. Herzel, H., O. Weiss, and E. N. Trifonov. 1998. Sequence periodicity in complete genomes of archaea suggests positive supercoiling. J. Biomol. Struct. Dyn. 16:341-345.
50. Hickey, D. A., and G. A. Singer. 2004. Genomic and proteomic adaptations to growth at high temperature. Genome Biol. 5:117.
51. Holder, M. E., and A. J. Roger. 2002. A shell-script program called “puzzleboot” that allows the analysis of multiple data sets with PUZZLE even though PUZZLE lacks the “M” option of many PHYLIP programs. http://hades.biochem.dal.ca/Rogerlab/Software/software.html#puzzleboot.
52. Jansen, R., J. D. van Embden, W. Gaastra, and L. M. Schouls. 2002. Identification of a novel family of sequence repeats among prokaryotes. Omics 6:23-33.
53. Kampmann, M., and D. Stock. 2004. Reverse gyrase has heat-protective DNA chaperone activity independent of supercoiling. Nucleic Acids Res. 32:3537-3545.
54. Kennedy, S. P., W. V. Ng, S. L. Salzberg, L. Hood, and S. DasSarma. 2001. Understanding the adaptation of Halobacterium species NRC-1 to its extreme environment through computational analysis of its genome sequence. Genome Res. 11:1641-1650.
55. Koonin, E. V., Y. I. Wolf, and L. Aravind. 2001. Prediction of the archaeal exosome and its connections with the proteasome and the translation and transcription machineries by a comparative-genomic approach. Genome Res. 11:240-252.
56. Lake, J. A., E. Henderson, M. Oakes, and M. W. Clark. 1984. Eocytes: a new ribosome structure indicates a kingdom with a close relationship to eukaryotes. Proc. Natl. Acad. Sci. USA 81:3786-3790.
57. Lambros, R. J., J. R. Mortimer, and D. R. Forsdyke. 2003. Optimum growth temperature and the base composition of open reading frames in prokaryotes. Extremophiles 7:443-450.
58. Lao, P. J., and D. R. Forsdyke. 2000. Thermophilic bacteria strictly obey Szybalski’s transcription direction rule and politely purine-load RNAs with both adenine and guanine. Genome Res. 10:228-236.
59. Lecompte, O., R. Ripp, J. C. Thierry, D. Moras, and O. Poch. 2002. Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale. Nucleic Acids Res. 30:5382-5390.




60. Lewis, P. J. 2001. Bacterial chromosome segregation. Microbiology 147:519-526.
61. Lobry, J. R. 1996. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 13:660-665.
62. Londei, P. 2005. Evolution of translational initiation: new insights from the archaea. FEMS Microbiol. Rev. 29:185-200.
63. Lopez, P., H. Philippe, H. Myllykallio, and P. Forterre. 1999. Identification of putative chromosomal origins of replication in Archaea. Mol. Microbiol. 32:883-886.
64. Lopez-Garcia, P., C. Brochier, D. Moreira, and F. Rodriguez-Valera. 2004. Comparative analysis of a genome fragment of an uncultivated mesopelagic crenarchaeote reveals multiple horizontal gene transfers. Environ. Microbiol. 6:19-34.
65. Lopez-Garcia, P., P. Forterre, J. van der Oost, and G. Erauso. 2000. Plasmid pGSS from the hyperthermophilic archaeon Archaeoglobus profundus is negatively supercoiled. J. Bacteriol. 182:4998-5000.
66. Lundgren, M., A. Andersson, L. Chen, P. Nilsson, and R. Bernander. 2004. Three replication origins in Sulfolobus species: synchronous initiation of chromosome replication and asynchronous termination. Proc. Natl. Acad. Sci. USA 101:7046-7051.
67. Luscombe, N. M., D. Greenbaum, and M. Gerstein. 2001. What is bioinformatics? A proposed definition and overview of the field. Methods Inf. Med. 40:346-358.
68. Lynn, D. J., G. A. Singer, and D. A. Hickey. 2002. Synonymous codon usage is subject to selection in thermophilic bacteria. Nucleic Acids Res. 30:4272-4277.
69. Maeder, D. L., R. B. Weiss, D. M. Dunn, J. L. Cherry, J. M. Gonzalez, J. DiRuggiero, and F. T. Robb. 1999. Divergence of the hyperthermophilic archaea Pyrococcus furiosus and P. horikoshii inferred from complete genomic sequences. Genetics 152:1299-1305.
70. Maisnier-Patin, S., L. Malandrin, N. K. Birkeland, and R. Bernander. 2002. Chromosome replication patterns in the hyperthermophilic euryarchaea Archaeoglobus fulgidus and Methanocaldococcus (Methanococcus) jannaschii. Mol. Microbiol. 45:1443-1450.
71. Makarova, K. S., L. Aravind, M. Y. Galperin, N. V. Grishin, R. L. Tatusov, Y. I. Wolf, and E. V. Koonin. 1999. Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell. Genome Res. 9:608-628.
72. Makarova, K. S., L. Aravind, N. V. Grishin, I. B. Rogozin, and E. V. Koonin. 2002. A DNA repair system specific for thermophilic Archaea and bacteria predicted by genomic context analysis. Nucleic Acids Res. 30:482-496.
73. Makarova, K. S., and E. V. Koonin. 2003. Comparative genomics of Archaea: how much have we learned in six years, and what’s next? Genome Biol. 4:115.
74. Makarova, K. S., and E. V. Koonin. 2005. Evolutionary and functional genomics of the Archaea. Curr. Opin. Microbiol. 117:52-67.
75. Makarova, K. S., Y. I. Wolf, and E. V. Koonin. 2003. Potential genomic determinants of hyperthermophily. Trends Genet. 19:172-176.
76. Makino, S., and M. Suzuki. 2001. Bacterial genomic reorganization upon DNA replication. Science 292:803.
77. Marguet, E., and P. Forterre. 1994. DNA stability at temperatures typical for hyperthermophiles. Nucleic Acids Res. 22:1681-1686.
78. Marguet, E., and P. Forterre. 1998. Protection of DNA by salts against thermodegradation at temperatures typical for hyperthermophiles. Extremophiles 2:115-122.
79. Marguet, E., and P. Forterre. 2001. Stability and manipulation of DNA at extreme temperatures. Methods Enzymol. 334:205-215.


80. Matsunaga, F., P. Forterre, Y. Ishino, and H. Myllykallio. 2001. In vivo interactions of archaeal Cdc6/Orc1 and minichromosome maintenance proteins with the replication origin. Proc. Natl. Acad. Sci. USA 98:11152-11157.
81. Matte-Tailliez, O., Y. Zivanovic, and P. Forterre. 2000. Mining archaeal proteomes for eukaryotic proteins with novel functions: the PACE case. Trends Genet. 16:533-536.
82. McLean, M. J., K. H. Wolfe, and K. M. Devine. 1998. Base composition skews, replication orientation, and gene orientation in 12 prokaryote genomes. J. Mol. Evol. 47:691-696.
83. Mojica, F. J., C. Diez-Villasenor, J. Garcia-Martinez, and E. Soria. 2005. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60:174-182.
84. Mojica, F. J., C. Diez-Villasenor, E. Soria, and G. Juez. 2000. Biological significance of a family of regularly spaced repeats in the genomes of Archaea, Bacteria and mitochondria. Mol. Microbiol. 36:244-246.
85. Mojica, F. J., C. Ferrer, G. Juez, and F. Rodriguez-Valera. 1995. Long stretches of short tandem repeats are present in the largest replicons of the Archaea Haloferax mediterranei and Haloferax volcanii and could be involved in replicon partitioning. Mol. Microbiol. 17:85-93.
86. Mrazek, J., and S. Karlin. 1998. Strand compositional asymmetry in bacterial and large viral genomes. Proc. Natl. Acad. Sci. USA 95:3720-3725.
87. Myllykallio, H., P. Lopez, P. Lopez-Garcia, R. Heilig, W. Saurin, Y. Zivanovic, H. Philippe, and P. Forterre. 2000. Bacterial mode of replication with eukaryotic-like machinery in a hyperthermophilic archaeon. Science 288:2212-2215.
88. Napoli, A., A. Valenti, V. Salerno, M. Nadal, F. Garnier, M. Rossi, and M. Ciaramella. 2004. Reverse gyrase recruitment to DNA after UV light irradiation in Sulfolobus solfataricus. J. Biol. Chem. 279:33192-33198.
89. Nelson, K. E., R. A. Clayton, S. R. Gill, M. L. Gwinn, R. J. Dodson, D. H. Haft, E. K. Hickey, J. D. Peterson, W. C. Nelson, K. A. Ketchum, L. McDonald, T. R. Utterback, J. A. Malek, K. D. Linher, M. M. Garrett, A. M. Stewart, M. D. Cotton, M. S. Pratt, C. A. Phillips, D. Richardson, J. Heidelberg, G. G. Sutton, R. D. Fleischmann, J. A. Eisen, and C. M. Fraser. 1999. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399:323-329.
90. Ng, W. V., S. P. Kennedy, G. G. Mahairas, B. Berquist, M. Pan, H. D. Shukla, S. R. Lasky, N. S. Baliga, V. Thorsson, J. Sbrogna, S. Swartzell, D. Weir, J. Hall, T. A. Dahl, R. Welti, Y. A. Goo, B. Leithauser, K. Keller, R. Cruz, M. J. Danson, D. W. Hough, D. G. Maddocks, P. E. Jablonski, M. P. Krebs, C. M. Angevine, H. Dale, T. A. Isenbarger, R. F. Peck, M. Pohlschroder, J. L. Spudich, K. H. Jung, M. Alam, T. Freitas, S. Hou, C. J. Daniels, P. P. Dennis, A. D. Omer, H. Ebhardt, T. M. Lowe, P. Liang, M. Riley, L. Hood, and S. DasSarma. 2000. Genome sequence of Halobacterium species NRC-1. Proc. Natl. Acad. Sci. USA 97:12176-12181.
91. Olsen, G. J., and C. R. Woese. 1997. Archaeal genomics: an overview. Cell 89:991-994.
92. Opalka, N., M. Chlenov, P. Chacon, W. J. Rice, W. Wriggers, and S. A. Darst. 2003. Structure and function of the transcription elongation factor GreB bound to bacterial RNA polymerase. Cell 114:335-345.
93. Paz, A., D. Mester, I. Baca, E. Nevo, and A. Korol. 2004. Adaptive role of increased frequency of polypurine tracts in mRNA sequences of thermophilic prokaryotes. Proc. Natl. Acad. Sci. USA 101:2951-2956.

94. Peng, X., K. Brügger, B. Shen, L. Chen, Q. She, and R. A. Garrett. 2003. Genus-specific protein binding to the large clusters of DNA repeats (short regularly spaced repeats) present in Sulfolobus genomes. J. Bacteriol. 185:2410-2417.
95. Pietrokovski, S. 2001. Intein spread and extinction in evolution. Trends Genet. 17:465-472.
96. Poole, A., D. Jeffares, and D. Penny. 1999. Early evolution: prokaryotes, the new kids on the block. Bioessays 21:880-889.
97. Pourcel, C., G. Salvignol, and G. Vergnaud. 2005. CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151:653-663.
98. Querol, E., J. A. Perez-Pons, and A. Mozo-Villarias. 1996. Analysis of protein conformational characteristics related to thermostability. Protein Eng. 9:265-271.
99. Robinson, N. P., I. Dionne, M. Lundgren, V. L. Marsh, R. Bernander, and S. D. Bell. 2004. Identification of two origins of replication in the single chromosome of the archaeon Sulfolobus solfataricus. Cell 116:25-38.
100. Rocha, E. P. 2004. Order and disorder in bacterial genomes. Curr. Opin. Microbiol. 7:519-527.
101. Rocha, E. P. 2004. The replication-related organization of bacterial genomes. Microbiology 150:1609-1627.
102. Rocha, E. P., and A. Danchin. 2003. Essentiality, not expressiveness, drives gene-strand bias in bacteria. Nat. Genet. 34:377-378.
103. Rocha, E. P., and A. Danchin. 2001. Ongoing evolution of strand composition in bacterial genomes. Mol. Biol. Evol. 18:1789-1799.
104. Saunders, N. F., T. Thomas, P. M. Curmi, J. S. Mattick, E. Kuczek, R. Slade, J. Davis, P. D. Franzmann, D. Boone, K. Rusterholtz, R. Feldman, C. Gates, S. Bench, K. Sowers, K. Kadner, A. Aerts, P. Dehal, C. Detter, T. Glavina, S. Lucas, P. Richardson, F. Larimer, L. Hauser, M. Land, and R. Cavicchioli. 2003. Mechanisms of thermal adaptation revealed from the genomes of the Antarctic Archaea Methanogenium frigidum and Methanococcoides burtonii. Genome Res. 13:1580-1588.
105. Schieg, P., and H. Herzel. 2004. Periodicities of 10-11bp as indicators of the supercoiled state of genomic DNA. J. Mol. Biol. 343:891-901.
106. Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502-504.

107. She, Q., R. K. Singh, F. Confalonieri, Y. Zivanovic, G. Allard, M. J. Awayez, C. C. Chan-Weiher, I. G. Clausen, B. A. Curtis, A. De Moors, G. Erauso, C. Fletcher, P. M. Gordon, I. Heikamp-de Jong, A. C. Jeffries, C. J. Kozera, N. Medina, X. Peng, H. P. Thi-Ngoc, P. Redder, M. E. Schenk, C. Theriault, N. Tolstrup, R. L. Charlebois, W. F. Doolittle, M. Duguet, T. Gaasterland, R. A. Garrett, M. A. Ragan, C. W. Sensen, and J. Van der Oost. 2001. The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. USA 98:7835-7840.

108. Singer, G. A., and D. A. Hickey. 2003. Thermophilic prokaryotes have characteristic patterns of codon usage, amino acid composition and nucleotide content. Gene 317:39-47.
109. Slesarev, A. I., K. V. Mezhevaya, K. S. Makarova, N. N. Polushin, O. V. Shcherbinina, V. V. Shakhova, G. I. Belova, L. Aravind, D. A. Natale, I. B. Rogozin, R. L. Tatusov, Y. I. Wolf, K. O. Stetter, A. G. Malykh, E. V. Koonin, and S. A. Kozyavkin. 2002. The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc. Natl. Acad. Sci. USA 99:4644-4649.
110. Suhre, K., and J. M. Claverie. 2003. Genomic correlates of hyperthermostability, an update. J. Biol. Chem. 278:17198-17202.
111. Tillier, E. R., and R. A. Collins. 2000. Genome rearrangement by replication-directed translocation. Nat. Genet. 26:195-197.
112. Vieille, C., and G. J. Zeikus. 2001. Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability. Microbiol. Mol. Biol. Rev. 65:1-43.
113. Waters, E., M. J. Hohn, I. Ahel, D. E. Graham, M. D. Adams, M. Barnstead, K. Y. Beeson, L. Bibbs, R. Bolanos, M. Keller, K. Kretz, X. Lin, E. Mathur, J. Ni, M. Podar, T. Richardson, G. G. Sutton, M. Simon, D. Soll, K. O. Stetter, J. M. Short, and M. Noordewier. 2003. The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Proc. Natl. Acad. Sci. USA 100:12984-12988.
114. White, M. F., and S. D. Bell. 2002. Holding it together: chromatin in the Archaea. Trends Genet. 18:621-626.
115. Wiezer, A., and R. Merkl. 2005. A comparative categorization of gene flux in diverse microbial species. Genomics 86:462-475.
116. Woese, C. R., O. Kandler, and M. L. Wheelis. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. USA 87:4576-4579.
117. Zhang, J. 1999. Performance of likelihood ratio tests of evolutionary hypotheses under inadequate substitution models. Mol. Biol. Evol. 16:868-875.
118. Zhang, R., and C. T. Zhang. 2005. Identification of replication origins in archaeal genomes based on the Z-curve method. Archaea 1:335-346.
119. Zhang, R., and C. T. Zhang. 2004. Identification of replication origins in the genome of the methanogenic archaeon, Methanocaldococcus jannaschii. Extremophiles 8:253-258.
120. Zivanovic, Y., P. Lopez, H. Philippe, and P. Forterre. 2002. Pyrococcus genome comparison evidences chromosome shuffling-driven evolution. Nucleic Acids Res. 30:1902-1910.


Archaea: Molecular and Cellular Biology 
Edited by Ricardo Cavicchioli 
© 2007 ASM Press, Washington, D.C. 


Chapter 20 


Functional Genomics 


FRANCIS E. JENNEY, JR., SABRINA TACHDJIAN, CHUNG-JUNG CHOU, 
ROBERT M. KELLY, AND MICHAEL W. W. ADAMS 


INTRODUCTION 


The substantial amount of genome sequence in- 
formation available for members of the Archaea facil- 
itates the use of functional genomics tools to investigate 
issues related to genetics, metabolism, physiology, and 
ecology of these microorganisms. Indeed, over 20 ge- 
nome sequences have been completed for archaea (see 
Chapter 19). In some cases, genome sequences for 
more than one member of a genus are available, e.g., 
Pyrococcus (39, 94, 146) and Sulfolobus (34, 95, 162). 
Given the phylogenetic placement of the domain Ar- 
chaea relative to the Bacteria and Eucarya, use of func- 
tional genomics to probe the relationship between 
genotype and phenotype for these fascinating mi- 
croorganisms can provide a unique perspective on 
both evolutionary processes and mechanisms in ex- 
tant biological systems. As genetic systems for archaea 
become further developed and implemented (see Chap- 
ter 21), functional genomics tools can be expanded 
to enable a full systems biology approach to study- 
ing archaea. 

The current status of functional genomics efforts 
to investigate archaea is reviewed in this chapter. In 
this discussion, functional genomics refers to tran- 
scriptional response (transcriptomics), protein inven- 
tory and differential abundance (proteomics), and 
protein structural attributes (structural genomics) ex- 
amined in the context of entire genomes. Ultimately, 
the goal is to integrate information gained from these 
three perspectives to discern biological mechanisms, 
although most efforts to date have focused on devel- 
oping and implementing effective protocols for each 
methodology for specific archaea. However, significant 
progress has been made through the use of functional 
genomics tools in understanding the biology of archaea, 
examples of which are presented. In addition, 
specific challenges and opportunities that relate to the 
use of functional genomics to study archaeal biology 
are considered. 


Choice of Model Organisms 


The choice of model archaea for functional ge- 
nomics studies arises to a large extent from the pre- 
vailing interests within the scientific community. As 
has been the case in other areas of biology, model sys- 
tems for functional genomics studies will emerge 
from long-term efforts focused on the biology of spe- 
cific archaea. Such efforts have generated information 
related to specific biomolecules, metabolic pathways, 
regulatory processes, and physiological patterns that 
can be investigated in a more complete way through 
functional genomics. Furthermore, model archaea are 
obviously those for which complete genome sequence 
information is available, although this may soon be 
a moot point when bacterial and archaeal genomes 
are sequenced within hours (116). Without genome 
sequence data, the use of functional genomics tools, 
though not impossible, is problematic. Model archaea 
should be readily cultivated in laboratory settings for 
a range of environmental and nutritional conditions; 
this is paramount to gaining functional genomics 
information on regulation, physiology, and ecology. 
The choice of model archaea for functional genomics 
studies may also depend on the availability of useful 
genetic systems. Despite the challenges in developing 
such systems for anaerobic and extremophilic ar- 
chaea, much progress has been made along these lines 
(see Chapter 21). On the other hand, functional ge- 
nomics tools can be used to circumvent some of the 
limitations. The combination of both modern and classical 
tools for studying biological processes is the best 
situation for systems biology approaches. 

Currently, certain archaea are emerging as model 
systems for functional genomics studies. Table 1 lists 
several possible model archaea chosen for the reasons 
discussed above. Within the hyperthermophilic ar- 
chaea, Pyrococcus furiosus is among the most exten- 
sively studied; indeed, over 600 PubMed literature ci- 
tations exist for this hyperthermophile at the time of 
writing (September 2005). P. furiosus was isolated 
from geothermal waters near Vulcano Island, Italy 
(58). It grows anaerobically near 100°C on a number 
of glucans in the presence or absence of elemental sul- 
fur (52) but will grow on peptides only if sulfur is pre- 
sent in the medium. Primary metabolic products are 
acetate, H₂, and H₂S (when sulfur is present). Not 
only has the P. furiosus genome been sequenced (146), 
but genomes for two other members of the genus are 
also available (39, 94). Although no genetic system is 
currently available for P. furiosus, efforts have been 
made in this respect for members of the genus Pyro- 
coccus (134). P. furiosus can be cultured in large-scale 
batch (181) and continuous culture (141) to relatively 
high biomass concentrations for a wide variety of 
growth conditions, thus facilitating functional ge- 
nomics efforts. P. furiosus has been used in laboratory 
studies for almost twenty years, is relatively easy to 
cultivate, and has a versatile growth physiology. As a 
result, a large number of proteins from P. furiosus have 
been characterized with respect to microbial biochemistry 
and biotechnology (2-4). In light of these factors, 
several functional genomics efforts have focused 
on P. furiosus, including transcriptomics, proteomics, 
and structural genomics, as summarized in Table 2. 


Sulfolobus solfataricus and P. furiosus represent 
the two most-studied hyperthermophilic archaea; 
over 400 PubMed citations exist for S. solfataricus. 
Isolated from an acidic hot spring near Naples, 
Italy, in 1980 (199), S. solfataricus has been exten- 
sively studied, for example, to understand the basis 
for protein thermostability (158), adaptation to low 
pH environments, biotechnology (190), and features 
of eucaryal genetics from the perspective of the ar- 
chaea (50). S. solfataricus grows fastest aerobically 
at pH 3.5 and 80°C on a range of glucans and peptides 
as carbon sources (199). Its utilization of elemental 
sulfur (S°) during growth has been questioned 
despite initial reports of this capability at the time of 
isolation. The S. solfataricus genome has been se- 
quenced (162), as have those of S. tokodaii (95) and 
S. acidocaldarius (34). Given its aerobic lifestyle and 
the large number of mobile genetic elements in its 
genome, a genetic system for S. solfataricus has been 
demonstrated (41). As the results from functional ge- 
nomics are obtained, the availability of a genetic sys- 
tem will be invaluable for further studying the func- 
tions of specific genes in this archaeon. 

Thermococcus kodakaraensis was isolated in 1994 
from a shallow marine vent off the coast of Japan 
(13). Similar in many ways to P. furiosus in terms of 
its growth physiology, albeit growing optimally at 
85°C rather than near 100°C, the biochemistry of 
T. kodakaraensis has been extensively studied using 
recombinant versions of its proteins produced with 
information derived from its recently reported genome 
sequence (61). Although functional genomics efforts 
have yet to be reported for this archaeon, the recent 
development of a versatile genetic system is very exciting 
and will pave the way for comprehensive systems 
biology studies (12). 


Table 1. Possible model archaeal systems for functional genomics studies 


Parameter                   Pyrococcus            Sulfolobus                Halobacterium           Thermococcus          Methanocaldococcus
                            furiosus              solfataricus P2           NRC-1                   kodakaraensis         jannaschii

Literature citations        >600                  ~400                      HBS:                    ~85                   ~160
Isolation                   Vulcano Island,       Pisciarelli Solfatara,    Salt mine, Bad Ischl,   Kodakara,             White Smoker Chimney,
                            Italy (1986)          Italy (1980)              Austria (1994)          Japan (1994)          East Pacific Rise (1985)
Topt for growth             100°C                 80°C                      aS                      85°C                  85°C
Physiology                  Fermentative          Heterotrophic             Extreme halophile       Fermentative          Autotroph
                            heterotroph           thermoacidophile                                  heterotroph
Oxygen requirement          Anaerobe              Aerobe                    Facultative aerobe      Anaerobe              Anaerobe
Easily plated               No                    Yes                       Yes                     No                    No
Year genome was sequenced   2001                  2001                      2000                    2005                  1996
Genome size (Mb)            1.91                  2.99                      2.57                    2.09                  1.74
Protein-coding ORFs         2,125                 2,977                     2,075                   2,306                 1,786
Chromosomal elements        1                     1                         1                       1                     3
Plasmids                    0                     0                         2                       0                     0
Genetic system              No                    Yes                       Yes                     Yes                   Some tools available
Closely related species     P. horikoshii         S. tokodaii               H. salinarum            P. horikoshii         Related methanogens
                            P. abyssi             S. acidocaldarius         H. volcanii             P. abyssi             are listed in Table 2.
                            T. kodakaraensis                                                        P. furiosus


Table 2. Archaeal “omics”: proteomic, transcriptomic, and structural genomic studies 


                                Proteomeᵃ                                                                     Transcriptome                           Structural genome
Organism                        Single (S) or multiple    No. of      Separation,                 Reference   Growth conditions           Reference   No. of recombinant
                                (M) growth conditions     proteins    identification technique                                                        proteins, structuresᵇ,ᶜ
                                                          identified

Methanocaldococcus jannaschii   M (H₂)                    10          2D, MT                      123         65/85/95°C                  22          119, 19
                                M (H₂, NH₄⁺)              2           2D, MT                      66
                                S                         170         2D, MT and LC-MS/MS         65
                                S                         954         LC, LC-MS/MS                198
                                S                         74          1D/LC, ESI/Q-FTMS           60
Methanococcus maripaludis       -                         -           -                           -           -                           -           18, 2
Methanococcoides burtonii       M (4/23°C)                54          2D, LC-MS/MS                69          -                           -           -
                                S                         528         LC, LC-MS/MS                71, 151
                                M (4/23°C)                163         ICAT/LC, LC-MS/MS           70
Methanosarcina acetivorans      M (methanol/acetate)      412         2D, MT                      107, 108    -                           -           -
Methanosarcina thermophila      M (methanol/acetate)      7           2D, comigration             87          -                           -           -
                                M (methanol/acetate)      6           2D, N-term                  49
Methanosarcina mazei            -                         -           -                           -           Methanol/acetate            82          37, 8
Methanothermobacter             -                         -           -                           -           -                           -           >212, 36ᵈ
  thermautotrophicus
Halobacterium salinarum         S (cytoplasm)             >800        2D, MT                      173         -                           -           Pes,
                                S (membrane)              114         1D, LC-MS/MS                102
                                M (salinity)              >50         2D, ESI-Q TOF-MS/MS         132
                                M (salinity)              29          2D, ESI-Q TOF-MS/MS         37
                                M (hydrocarbons)          5           2D, ESI-Q TOF-MS/MS         92
Haloferax volcanii                                                                                            Peptides/glucose            193         None
Halobacterium NRC1              M (membrane mutants)      272         ICAT, µLC-ESI-MS/MS         14          M (membrane mutants)        14          None
                                S (membrane, cytoplasm)   426         µLC-ESI-MS/MS               68          UV radiation                14          8, 0
                                                                                                              DMSO/TMAO                   124
Pyrococcus furiosus             S (membrane, cytoplasm)   66          2D, MT and µLC-ESI-MS/MS    80          Sulfur (partial array)      157         425, 39ᵉ
                                                                                                              90/105°C (partial array)    165
                                S                         62          2D, MT and µLC-ESI-MS/MS    109         Peptides/maltose            156
                                                                                                              95/72°C                     188
Pyrococcus horikoshii           -                         -           -                           -           -                           -           116, 87
Thermoplasma acidophilum        S                         1           2D, N-term                  189         -                           -           74, 17
Archaeoglobus fulgidus          -                         -           -                           -           78/89°C                     148         122, 31
Sulfolobus solfataricus         -                         -           -                           -           -                           -           52, 9
Aeropyrum pernix                -                         -           -                           -           -                           -           39, 45


ᵃThe proteome list does not include studies where patterns were presented but individual proteins were not identified (e.g., 46, 121). The abbreviations are: 2D, 2D-gel electrophoresis; MT, MALDI-TOF. 
ᵇThe recombinant proteins and structures are those made available from structural genomics projects that target the individual organisms. 
ᶜData are derived from the Protein Data Bank (http://targetdb.pdb.org/) representing data voluntarily deposited by structural genomics centers worldwide. These data do not include individual research projects outside of the structural genomics groups. Recombinant proteins refers to purified proteins (not simply targets for expression). Structures refer to both X-ray and NMR-based structures, and not all structures counted have been released. 
ᵈFor more information, see reference 154. 
ᵉFor more information, see reference 185. 

Several methanogens have been studied to vari- 
ous extents using functional genomics tools and, thus, 
represent prospective model systems. Among these, 
members of the Methanococcales (179), such as Me- 
thanocaldococcus jannaschii, whose sequence was re- 
leased in 1996 (73), have already been the object of 
several proteomic (60, 65, 198) and transcriptional 
response studies (22). In addition, the genome of Me- 
thanococcus maripaludis was recently sequenced (78) 
and the availability of mutagenesis tools for this 
methanogen will facilitate functional genomics studies 
(122). The Methanosarcinales have been the focus 
of genomics studies; no fewer than three complete 
genome sequences (47, 63) and one shotgun sequence 
(42, 152) have become available over the last few 
years. A genetic system has been developed for the 
genus Methanosarcina (119), and additional species- 
specific tools have been reported for Methanosarcina 
acetivorans (9, 135, 195, 196), Methanosarcina mazei 
(54), and Methanosarcina thermophila (49, 104, 
120). Furthermore, transcriptional data generated by 
DNA microarray experiments were recently reported 
for M. mazei (82). The proteomes of M. acetivorans 
(60, 107, 108), Methanococcoides burtonii (69-71, 
151) and M. thermophila (49) have also been exam- 
ined. In addition, members of the Methanobacteri- 
ales, whose flagship organism Methanobacterium 
thermautotrophicum (now Methanothermobacter 
thermautotrophicus) was sequenced in 1997 (166), 
are currently the object of structural proteomics ini- 
tiatives (38, 192). 

Last, but not least, are the halophilic archaea, 
some members of which were the first archaeal model 
systems for “extreme” environments. Halobacterium 
NRC1 is one of three halophiles that have been stud- 
ied to a significant extent with respect to physiology, 
metabolism, and genetics (68, 96, 124, 126, 186), 
H. halobium (salinarum) (19, 36, 102, 169, 173) and 
Haloferax volcanii being the others (7, 8, 18, 93, 
169). Any or all of these can be viewed as a model 
system for functional genomic analyses, with pio- 
neering studies in Halobacterium NRC1 having al- 
ready been reported (68, 92). 


Merits and Challenges of Functional 
Genomics Approaches 


The new field of systems biology involves the ra- 
tionalization of the behavior of an organism in terms 
of the interplay of all of its genes and the proteins that 
they encode. This approach has been a driving force 
for and has greatly facilitated genomewide and multigenome 
studies, thereby generating unprecedented 
amounts of data that need to be sifted through for specific 
insights. Initial efforts with transcriptomics and 
proteomics indicated that, while the influence of specific 
stimuli on every gene and protein associated with an 
organism’s genome could potentially be monitored, such 
information may or may not be readily useful. This 
realization was sobering 
given the need for highly sophisticated, analytical, 
and statistical skills, as well as the significant expense, 
that are typically part and parcel of functional ge- 
nomics approaches. Thus, the substantial promise of- 
fered by functional genomics can be tempered at pre- 
sent by the investment in terms of time and money 
required and the actual outcome in terms of useful bi- 
ological information. 

Table 2 summarizes current activities with respect 
to the use of “omics” to study archaea. Not unlike 
other areas of biological research, “omics” tools have 
been used to various extents and with varying effec- 
tiveness. In some cases, studies have shown that certain 
tools have been especially useful in exploring biologi- 
cal phenomena. In other cases, beyond the demonstra- 
tion that a specific “omics” tool can be applied to a 
particular archaeon, not much new and useful information 
has resulted. It remains to be seen how these 
new tools will affect the study of archaea and how they 
can best serve a systems biology perspective. 

As functional genomics tools become more avail- 
able and are implemented for the study of archaea, it 
will be important to make strategic use of the tech- 
nologies. Initial efforts have often been “look and 
see,” with the expectation that genomewide responses 
to environmental perturbations will lead to interesting 
insights. While this has been true in some instances, 
hypothesis-driven efforts will be increasingly impor- 
tant so that the end result of functional genomics stud- 
ies provides definitive answers to biological questions. 


PROTEOMICS 


The term “proteomics” is used herein to describe 
the experimental determination of protein species 
within a given cell on a genomewide basis. Such ap- 
proaches obviously include the identification of pro- 
teins and can also include quantitation relative to 
other proteins in the cell or quantitation relative to 
that observed when the cell is grown under different 
conditions. Purely computational analyses of the pro- 
teins encoded by a genome are not considered, unless 
coupled with experimental investigation. The pro- 
teomic analyses of archaeal species have followed the 
methodologies that have been developed for bacteria 
and eucarya in general. In fact, some of the earlier 
pioneering studies using two-dimensional gel electrophoresis 
(2-DGE) were carried out with the halophilic 
archaea (46) and the methanogenic archaeon M. jannaschii 
(123), where changes in protein patterns were 
evident in response to high temperature and pressure, 
respectively. Spot identification subsequently became 
routine, initially by N-terminal Edman sequencing, 
and then by mass spectrometry (MS), typically of 
tryptic peptides. The efficiency of identifying proteins 
in 2-DGE spots has benefited from advances in 
analytical liquid chromatography (LC) 
as well as in MS, with archaea providing some samples 
for method development (109). The 2-DGE approach 
has several severe limitations, due to the nature of the 
separation medium and the staining methods. It is 
slowly being replaced in the more well-studied bacte- 
rial systems by direct high-throughput LC of complex 
mixtures coupled to more sophisticated MS applica- 
tions (see, for example, references 84, 111, 197). Such 
advances are also being utilized by the research com- 
munity in general, including those working with ar- 
chaea, and some examples are included herein. The 
next frontier is environmental proteomics; archaea 
have been involved, although they have not yet played 
a dominant role (for example, reference 143). As is 
the case with many facets of archaeal biology, the 
field of proteomics can be divided into accomplish- 
ments achieved with three different groups of archaea: 
methanogens, halophiles, and organisms that fit into 
neither category. However, this will no doubt change 
significantly in the near future, as the proteomes of 
more and more novel organisms are examined by in- 
creasingly rapid, high-throughput (HTP) techniques. 
Nevertheless, these categories are currently conve- 
nient and will be used here. The reader is also referred 
for additional information to an excellent summary 
of the proteomics techniques that have been applied 
to archaea (32). 


Proteomics of Methanogens 


M. jannaschii 


M. jannaschii was isolated from a deep-sea hydro- 
thermal vent and is regarded as a hyperthermophile, 
with an optimum growth temperature near 90°C. Its 
genome sequence was published in 1996, the fourth 
genome to be sequenced (27) and the first from an ar- 
chaeon. It is, perhaps, not surprising that M. jan- 
naschii has been the subject of many genome-related 
analyses (73). These include a seminal proteomics 
study (123) using 2-DGE to monitor changes in the 
cellular protein content when cells of this strict hy- 
drogenotroph were grown under high (178 kPa) and 
ultralow (650 Pa) partial pressures of hydrogen. 
While numerous changes were seen in the 2-DGE patterns 
of the two growth conditions, attention was focused 
on representatives of each: five from low pH₂- 
and one from high pH₂-grown cells (pH₂ denotes the 
H₂ partial pressure). Low pH₂ caused 
increased levels of F420-dependent methylenetetrahydromethanopterin 
dehydrogenase (MTD) and four 
flagellar proteins, while high pH₂ favored higher levels 
of H₂-dependent methylenetetrahydromethanopterin 
dehydrogenase (HMDX). The response of the two enzymes 
was anticipated, but regulation of expression 
of flagella by hydrogen had not been seen in any do- 
main of life before this study. This was also the first 
example of the regulation of flagella synthesis in the 
Archaea and demonstrated that M. jannaschii does 
exhibit a chemotactic response. 

A subsequent proteomic analysis by 2-DGE of 
M. jannaschii also focused on two flagellar proteins 
B1 and B2 (encoded by MJ0891 and MJ0892, respectively). 
Their cellular concentrations changed in response 
to H₂ concentration, availability of ammonia, 
and the stage of cell growth (66). Multiple protein 
spots were observed after separation by 2-DGE that 
were not observed in the earlier study (123). These 
yielded peptides that corresponded to these two pro- 
teins, even though the spots were at different isoelec- 
tric points and mass values. Moreover, the abundance 
of the spots varied with the growth conditions. Given 
the extreme stability of proteins from organisms like 
M. jannaschii that thrive at temperatures close to 
90°C, it is also possible that multiple protein spots af- 
ter 2-DGE could arise from the incomplete dissocia- 
tion and denaturation of protein-protein complexes. 
Alternatively, or in addition, the flagellar proteins 
might be proteolytically cleaved or have varying de- 
grees of glycosylation (88), phosphorylation (see be- 
low), or deamidation (66). It was concluded that the 
flagellar proteins are structurally modified in some 
fashion in response to the cellular environment and 
growth stage, although this complex regulatory re- 
sponse is, as yet, not understood. 

The same authors subsequently reported a more 
complete proteome analysis of M. jannaschii using 
2-DGE and matrix-assisted laser desorption ioniza- 
tion-time of flight mass spectrometry (MALDI-TOF- 
MS) and liquid chromatography coupled to tandem 
mass spectrometry (LC/MS/MS) to analyze tryptic di- 
gests (65). In an effort to separate and identify as 
many proteins as possible, changes were made to the 
typical separation approach, including the use of two 
different protein stains and the use of two different 
parameters to resolve proteins in the first dimension 
(nonequilibrium pH electrophoresis as well as iso- 
electric focusing). A total of 170 proteins of the 1,738 
proteins annotated in the genome were identified after 
the analyses of extracts of cells grown under “optimal” 
conditions. Almost a third of them were annotated 
as (conserved) hypothetical proteins, demonstrating 
the power of the approach in showing that 
previously uncharacterized proteins were produced 
within the cell. Several proteins behaved anomalously 
during 2-DGE, indicating that not all protein complexes 
are dissociated by the sample preparation procedure. 
For example, the three subunits (αβγ) of 
methyl coenzyme M reductase isoenzyme I and one 
subunit of isoenzyme II were all identified in the same 
spot on the 2-DGE. In addition, some proteins were 
detected in more than one spot, suggesting differences 
in degrees of phosphorylation in the case of an elon- 
gation factor (MJ0822), or in glycosylation in the 
case of an S-layer protein (MJ0324). It was also ap- 
parent that glycosylation might interfere with protein 
staining. All in all, this study nicely demonstrated 
that, even for relatively small genomes such as that 
of M. jannaschii, which at 1.7 Mb (27) is close to a 
third that of Escherichia coli (4.6 Mb [20]), there was 
far more to rationalizing proteomic analyses than 
merely identifying proteins. 

The most recent report on the proteome of 
M. jannaschii employed a non-gel method of protein 
separation (198). This so-called shotgun technique, 
which had been successfully applied to identify pro- 
teins in lower eucarya, involves digestion of proteins 
in a complex mixture without prior separation, and 
then their separation and identification using multi- 
dimensional chromatographic capillary columns and 
LC/LC/tandem MS. Hundreds of proteins can be iden- 
tified in the same complex mixture. For M. jannaschii, 
more than 13,000 peptides were identified and as- 
signed to 954 proteins, more than half the proteins 
encoded by the complete genome. They ranged in size 
from less than 10 to more than 100 kDa, and almost 
all identifications were based on more than two peptides. 
More than 40% were identified with peptides covering 
over 50% of their sequence, and these are assumed to represent the 
more abundant proteins in the cell. About 60% of the 
genes in the genome encode (conserved) hypothetical 
proteins, and about 27% of them were identified. A 
total of 32 of the 36 proteins involved in the conversion 
of CO₂ to methane were identified. Several steps 
are catalyzed by multiple enzymes, and their relative 
abundance in the cells (grown under high H₂ 
partial pressure) indicated unique regulatory control 
of methane production. The proteome analysis also 
identified self-splicing intein peptides and new pro- 
teins created by splicing, once more demonstrating 
the complexity of even a “simple” genome. 
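
The notion of a protein being identified over a given fraction of its sequence can be illustrated with a short sketch. The Python example below is not taken from the cited study (198); the protein sequence, the peptide list, and the function name sequence_coverage are hypothetical, and the calculation simply reports the fraction of residues that fall within at least one matched peptide.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the fraction of residues in `protein` covered by any peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein) if protein else 0.0


# Hypothetical 20-residue protein and two matched tryptic peptides.
protein = "MKTAYIAKQRQISFVKSHFS"
peptides = ["TAYIAK", "QISFVK"]
coverage = sequence_coverage(protein, peptides)
print(f"{coverage:.0%}")  # 12 of 20 residues are covered
```

Under this definition, a protein would count toward the "over 50%" group when the returned value exceeds 0.5; real search engines apply additional scoring that is not modeled here.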

A more targeted, non-2-DGE approach with 
M. jannaschii was recently reported (60) that used 
a continuous elution denaturing one-dimensional 
gel electrophoresis approach, coupled to subsequent 
reverse-phase liquid chromatography and electrospray 
ionization quadrupole Fourier transform mass spectrometry 
(ESI/Q-FTMS) to fragment and identify intact proteins. 
For a ribosomal-enriched fraction, 24 of the 68 proteins 
predicted to be in the ribosome were identified. 
In addition, 50 proteins were identified from fractionated 
samples of cell extracts, and four were found 
to have significant mass shifts due to methylation, 
acetylation, and/or incorrect start sites. Analysis of a 
histone-enriched fraction gave no evidence for post- 
translational modification of this DNA-binding pro- 
tein, in agreement with the absence of the relevant en- 
zymes from the M. jannaschii genome. However, the 
same rather surprising conclusion was reached by a 
similar analysis of histone proteins in Methanosarcina 
acetivorans, whose genome does contain acetyl trans- 
ferase enzymes predicted to modify histones (60). 


Methanococcoides burtonii 


Methanococcoides burtonii was isolated from 
permanently cold (2°C), methane-rich lake water in 
Antarctica, although in the laboratory it grows fastest 
(Topt) at 23°C. The organism uses C-1 compounds 
such as methylamines and methanol, but not formate, 
H₂ and CO₂, or acetate, as growth substrates. The 
draft genome of the organism indicates that the ge- 
nome is ~2.9 Mb in size and encodes 2,676 proteins 
(152). A proteomics approach to investigate protein 
levels in cells grown at low (4°C) and optimal tem- 
peratures was recently reported by Goodchild et al. 
(71). This study represented the first global analysis of 
proteins involved in cold adaptation (rather than ex- 
amining the cold shock response). The results were 
particularly of interest since this archaeon does not 
contain homologs of the well-characterized cold shock 
protein (Csp) family found in bacteria. Comparison 
of 2-DGE from the two cell types revealed 54 proteins 
that appeared to be differentially regulated, and these 
were identified by LC/ESI-MS/MS. Reverse-transcrip- 
tase PCR analyses were conducted for the expression 
of the genes encoding 33 of the 54 proteins, and for 
about half of them there was good agreement between 
the observed changes in protein and mRNA levels. Of 
course, an exact match is unlikely, given that the sta- 
bility of RNA and protein products will vary from 
gene to gene. There were also examples of posttrans- 
lationally modified proteins. One example was methyl 
coenzyme M reductase, whose three subunits were 
identified in eight separate spots. This made quantita- 
tion difficult, but an examination of mRNA levels 
showed that all three genes were up-regulated at the 
lower temperature by approximately 3-fold. 
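
The comparison of protein and mRNA changes described above can be sketched as a simple direction-of-change test. The following Python example is illustrative only: the gene names, fold-change values, and the 2-fold cutoff are assumptions made for this sketch and are not data from the study (71).

```python
import math


def same_direction(protein_fc: float, mrna_fc: float, min_fold: float = 2.0) -> bool:
    """True if both fold changes reach `min_fold` in the same direction."""
    p, m = math.log2(protein_fc), math.log2(mrna_fc)
    cutoff = math.log2(min_fold)
    return (p >= cutoff and m >= cutoff) or (p <= -cutoff and m <= -cutoff)


# Hypothetical fold changes (4°C relative to 23°C): gene -> (protein, mRNA).
measurements = {
    "geneA": (3.0, 2.8),   # up in both
    "geneB": (2.5, 1.1),   # protein up, mRNA essentially unchanged
    "geneC": (0.4, 0.3),   # down in both
    "geneD": (0.45, 2.2),  # opposite directions
}
agreeing = [g for g, (p, m) in measurements.items() if same_direction(p, m)]
fraction = len(agreeing) / len(measurements)
print(agreeing, fraction)
```

Working in log2 space treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the cutoff is applied to the logarithm rather than to the raw ratio.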

The results showed that adaptation to temperatures 
significantly below the optimum involves transcription 
(RNA polymerase subunit E) and protein 
folding (peptidyl prolyl cis/trans isomerase). There 
was also evidence that the optimum growth temperature is 
a stressful condition that requires production of 
the heat shock protein DnaK. In a companion study the 
same authors identified approximately 20% of the 
proteins encoded in the genome (528 of 2676) by di- 
rect separation and analysis of trypsin-treated cell- 
free extracts by LC-MS/MS (2-DGE was not utilized). 
Of those 528 identified, about 25% were previously 
annotated as hypothetical proteins. The remainder 
were assigned a variety of anticipated physiological 
roles, including the complete set of proteins of the 
methanogenic pathway (see also reference 151). One 
unanticipated result was the identification of two 
transposases, indicating the potential for active genetic 
exchange at these cold temperatures. 

The proteome of M. burtonii has also been exam- 
ined (70) by a second generation MS-based technique 
using isotope labeling, the so-called isotope-coded 
affinity tag (ICAT) approach (74). This involves labeling 
cysteine residues already present in proteins in 
complex mixtures with either a ¹²C- or ¹³C-labeled 
reagent. The two protein mixtures are then mixed and 
are digested with trypsin. The peptides are purified by 
affinity chromatography and analyzed by MS/MS. The 
essence of this approach is that the relative abundance 
of the ¹²C and ¹³C isotopes allows accurate quantitation 
as well as the identification of the same protein 
in the two samples. In the case of M. burtonii, cells 
were grown at 4 and at 23°C, the same as were used 
for the 2-DGE study, and a total of 163 proteins were 
identified using the double-labeling approach (70). 
Labeling consistency of each sample was assessed 
by two independent labelings, and the relative abun- 
dance of the peptides that were identified was compa- 
rable. However, note that, although statistical analyses 
showed a high level of confidence in protein identifi- 
cation, only 48 of the 163 were identified from two or 
more peptides. A comparison of the results from this 
study with those from the 2-DGE showed that the two 
approaches are very complementary. Thus, the num- 
ber of proteins identified by the 2-DGE approach that 
were differentially expressed by at least 2-fold (16%) 
was 4-fold higher than in the ICAT study. This is as- 
sumed to be a result of the higher stringency used for 
the ICAT analysis, where the relative abundances must 
be consistent in all labeling experiments. Similarly, the 
2-DGE and ICAT studies each identified unique pro- 
teins (21 and 11, respectively), further emphasizing 
the need to carry out both techniques to optimize cov- 
erage. In general, the biological implications of the 
ICAT study confirmed and extended those made from 
the 2-DGE analysis, with a key regulatory role for pro- 
teins involved in methanogenesis and transcription, as 
well as hypothetical proteins of unknown function. 
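
The quantitation principle behind ICAT, and the requirement that relative abundances be consistent across independent labelings, can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the peak intensities are invented, the assignment of light and heavy reagent to the two samples is arbitrary, and the tolerance of 0.5 log2 units is a hypothetical cutoff, not the criterion used in the study (70).

```python
import math
from statistics import mean


def icat_log2_ratios(pairs):
    """log2(heavy/light) for each (light, heavy) peak-intensity pair."""
    return [math.log2(heavy / light) for light, heavy in pairs]


def consistent_fold_change(labeling_1, labeling_2, tolerance=0.5):
    """Mean fold change (heavy/light) if two labelings agree, else None.

    Agreement here means the mean log2 ratios of the two independent
    labelings differ by no more than `tolerance` (an arbitrary cutoff).
    """
    r1 = mean(icat_log2_ratios(labeling_1))
    r2 = mean(icat_log2_ratios(labeling_2))
    if abs(r1 - r2) > tolerance:
        return None
    return 2 ** mean([r1, r2])


# Hypothetical cysteine-containing peptides from one protein.
labeling_1 = [(1000.0, 4000.0), (500.0, 2000.0)]  # both peptides 4-fold
labeling_2 = [(800.0, 3200.0)]                    # 4-fold
fold = consistent_fold_change(labeling_1, labeling_2)
print(fold)
```

A protein whose two labelings disagree returns None and would be excluded, which mirrors in a simplified way why a stringent consistency requirement reduces the number of proteins reported as differentially abundant.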


Methanosarcina acetivorans 


One of the surprises of the genome sequencing 
projects was the large size of the genome of M. ace- 
tivorans. At 5.75 Mb, it is the largest yet reported for 
any member of the Archaea (63). Clearly, this organ- 
ism has a considerable metabolic capacity that is well 
beyond that of, for example, M. jannaschii, whose 
genome is less than one third the size. One striking 
factor is the large number of duplicated genes in the 
M. acetivorans genome; the role of these paralogs, and 
how their expression is regulated is largely unknown. 
A recent proteomic study sought to address this issue 
(107, 108). Using 2-DGE coupled with MS analysis 
(MALDI-TOF-TOF) after in-gel tryptic digestion, a 
total of 412 proteins were identified, 30% of which 
were annotated as encoding hypothetical or conserved 
hypothetical proteins. Many of the proteins were de- 
tected only in acetate- (122 proteins) or methanol- 
grown cells (102), rather than both (188). Many of 
them included paralogs involved in the methanogene- 
sis pathway. Some, such as the three operons encoding 
the methanol-utilizing methyltransferases, appeared to 
be differentially regulated, while others, such as the 
two operons encoding the heterodisulfide reductase, 
were not. Other genes apparently differentially regu- 
lated included those involved in ATP synthesis, protein 
synthesis, protein-folding, and stress responses. One 
of the most intriguing findings concerned the presence 
in acetate-grown cells of proteins that are part of a 
NADH-dependent, sodium-transporting membrane 
oxidoreductase. The results suggest that different 
methanogens may have different mechanisms for en- 
ergy conservation during methanogenesis. This pro- 
teomics study has given the first glimpse into the ex- 
tremely complex metabolism of a fascinating but poorly 
understood methanogen.


Methanosarcina thermophila 


Methanosarcina thermophila is a moderately 
thermophilic methanogen and was one of the first 
archaea to be studied using a proteomic approach. 
The protein content of acetate- and methanol-grown 
cells was examined using 2-DGE, and the study re- 
solved more than 400 protein spots from each growth 
condition, with more than 100 being exclusive to one 
or the other (87). Spots corresponding to three en- 
zymes (acetate kinase, phosphotransacetylase, and 
carbon monoxide dehydrogenase) were identified in 
acetate-grown cells using purified proteins as mark- 
ers, and these were either absent or were at a lower 
level in methanol-grown cells. These studies were ex- 
tended by Ding et al. (49) who identified the most 
abundant proteins in methanol- and acetate-grown 
cells by N-terminal Edman sequencing and estimated
the relative abundances of subunits of the methanol: 
coenzyme M methyltransferase (MTase) system. 
MTase I contains two subunits (MtaB and MtaC), 
while MTase II contains only one (MtaA), and there 
are two homologs of MtaA and three for both MtaB 
and MtaC. Some of these were shown to be at higher 
levels, or only detected in methanol-grown cells, and 
some only in acetate-grown cells. Global proteomic 
analyses have not been reported for this organism, 
and its genome sequence is not yet publicly available, 
in contrast to those of the related species M. mazei, 
M. acetivorans, and M. barkeri (47, 63; see
www.genomesonline.org).


Proteomics of Halophiles 


Extensive proteomic analyses have been carried 
out with the halophilic archaea (92). In a seminal pa- 
per, 2-DGE was used to analyze cell extracts of sev- 
eral Halobacterium species and demonstrated that 
they exhibit a response to heat shock (60°C for 3 min) 
by synthesizing a limited number of proteins of de- 
fined molecular weight range (46). Cells rapidly as- 
sumed their “normal” pattern of protein expression 
when returned to the “normal” temperature (37°C).
Some proteins that appeared in response to heat were 
also observed in gel patterns when cells were exposed 
to a change in salinity, suggesting that, like the other 
two domains of life, archaea, or at least the halophilic 
species, exhibited a general stress response. Specific 
changes in protein patterns on 2-DGE were subse- 
quently observed with Haloferax volcanii when it 
was grown at high and low salinities, although the 
identities of specific proteins were not determined 
(121). The problems caused by the need for high-salt 
concentrations to prevent aggregation and precipita- 
tion of proteins from halophiles have recently been 
addressed by using filtration methods in the case of 
Halobacterium salinarum (36) and various washing 
methods with H. volcanii (93), although in these stud- 
ies, proteins were not identified and differential ex- 
pression was not assessed. 

The first study reporting the large-scale identification
of expressed proteins of a haloarchaeon was for
Halobacterium sp. NRC-1 (14). The genome of this
organism contains 2,682 putative protein-coding genes (126).
The ambitious proteome analysis was part of a global 
systems approach that also included transcriptional 
profiling and utilized the wild-type strain and three 
mutants that either overproduced or did not produce 
the light-harvesting, membrane-associated bacteriorhodopsin.
The proteomic approach involved direct
analysis of differentially labeled cell extracts using the
ICAT technique as described above for M. burtonii. 
A total of 272 different proteins were identified from 
1,120 tryptic peptides, and 50 of these proteins
were differentially produced in the wild-type
and mutant strains. Note that the ICAT approach can 
detect quite minor changes (<50%) in relative protein
abundance (76). Only seven of the 50 regulated
proteins (14%) were of unknown function, compared with
the 41% predicted from the genome sequence. The rest
of the proteins were associated with phototrophy, 
membrane and carotenoid biosynthesis, and primary
nitrogen metabolism (arginine, glutamate, and pyrimidine
metabolism). Perhaps surprisingly, corresponding
changes in the transcriptional levels were observed
for only 17 of these 50 differentially regulated
proteins. Nevertheless, it could be concluded that at 
the systems level, there was reciprocal regulation of 
phototrophy and arginine fermentation, which are 
the two major pathways of energy conservation in 
Halobacterium sp. 

The protein content of the insoluble membrane 
and soluble cytoplasmic fractions of cell extracts of 
Halobacterium sp. NRC-1 has also been examined directly
using a simple shotgun approach. The tryptic
digests of the two fractions were separated by micro- 
capillary HPLC, and peptides were identified by ESI- 
MS/MS. A total of 426 proteins were identified, with 
401 encoded in the chromosome and 25 in the mini- 
chromosomes of this organism. Of these, 232 were 
soluble, 165 insoluble, and 29 in both fractions. In 
this case, only one growth condition was used and 
regulatory effects were not assessed. 

A global proteomic analysis of H. salinarum 
(strain R1, DSM681) was recently reported (173). As 
most halophilic proteins are acidic (82% have pI 
values between 3.5 and 5.5), one of the goals was 
to resolve proteins that migrate over a narrow pH 
range. The use of overlapping 2D gels and subsequent 
MALDI-TOF MS peptide mass fingerprinting (PMF) 
led to the identification of over 800 proteins in the 
more than 1,800 spots resolved in the cytoplasmic 
fraction. A semiautomated protocol was utilized which 
included automated spot excision, 96-well plate-based 
in-gel digestions, and automated spectral acquisition 
and peak annotation generation. The genome sequence 
of the organism has not been released (www.genomesonline.org)
but is reported (173) to contain at least
2,784 open reading frames (ORFs) that encode proteins.
Approximately 40% of the proteins encoded by
these ORFs were therefore identified by 2-DGE analy- 
sis. It was estimated that the gels contained 1,600 
unique proteins, indicating that a high percentage of 
the number of predicted cytoplasmic proteins (a total 
of 1,959) are present in cells grown under “standard” 
conditions. Almost 100 proteins were found in more
than one spot on the 2D gel, indicating that a signifi- 
cant fraction may be posttranslationally modified, al- 
though some might be artifacts (deamidation, car- 
bamylation, etc.) of the separation procedure (173). 
The correlation between predicted and observed pro- 
tein migration on the 2D gels enabled reassignment 
of start codons for some proteins based on their ap- 
parently anomalous positions. The results showed 
two general biases in the types of proteins identified. 
First, proteins with a mass above 25 kDa were more 
readily identified than smaller ones (50% versus
20% of those predicted), and this is presumably be- 
cause of the smaller number of tryptic peptides from 
smaller proteins. Second, chromosomally encoded 
proteins were much more readily detected (48% of 
predicted) than those encoded by the four megaplas- 
mids (41 to 284 kb; 14% of predicted) in this organ- 
ism, suggesting that in general the latter are not sig- 
nificantly expressed under the growth conditions. 
Tebbe et al. (173) speculated that a higher percentage
of H. salinarum proteins was not identified by the
2-DGE analyses because of a lack of gene expression,
loss during sample preparation, poor staining, or
failure of identification by MALDI-TOF.

The membrane proteomes of H. halobium and 
H. salinarum have also been analyzed. The challenge
with membrane proteins is to solubilize them
under conditions that allow separation while maintaining
solubility. This was achieved with the membrane
proteins of H. halobium by a rapid and simple
technique involving solubilization with, and tryptic 
digestion in, aqueous methanol and separation and 
peptide identification using a micro-LC-MS/MS ap- 
proach (21). A total of 41 proteins were identified, 
80% of which were predicted to contain transmem- 
brane domains. These included all tryptic peptides 
of bacteriorhodopsin, even those posttranslationally 
modified. In an extension of the proteomics analysis of 
H. salinarum (173), an LC, rather than a 2D, approach
was used to analyze membrane proteins (102). Initial 
studies using various approaches showed that 2D 
analysis was not very productive because the membrane
proteins precipitated at their pI and could not
be solubilized, and peptides obtained from resolved
spots were not suited to identification by MALDI-TOF
analysis. Nevertheless, one-dimensional sodium
dodecyl sulfate (SDS) gel separation coupled with LC-MS/MS
resulted in the identification of 114 proteins
that were predicted to contain transmembrane α-helical
regions, which is about 20% of the number estimated 
from the genome sequence. In combination with the 
study of the cytoplasmic proteins, a total of almost 
950 H. salinarum proteins were identified, representing
34% of the predicted gene products. This archaeon
is therefore one of the best characterized of all organisms
by proteomic approaches.
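
The coverage percentages quoted for H. salinarum can be checked with simple arithmetic. The following Python sketch uses only counts given in the text, treating "over 800" and "almost 950" as round approximations; the function name `coverage_percent` is introduced here purely for illustration. On these numbers, the "approximately 40%" figure matches the 1,959 predicted cytoplasmic proteins rather than all 2,784 ORFs. That reading is an inference from the arithmetic, not a statement taken from reference 173.

```python
def coverage_percent(identified: int, predicted: int) -> float:
    """Return identified/predicted expressed as a percentage."""
    if predicted <= 0:
        raise ValueError("predicted must be a positive count")
    return 100.0 * identified / predicted


# Counts quoted in the text (approximate where the text says "over" or "almost").
PREDICTED_ORFS = 2784          # ORFs reported for H. salinarum (173)
PREDICTED_CYTOPLASMIC = 1959   # predicted cytoplasmic proteins
CYTOPLASMIC_IDENTIFIED = 800   # "over 800" proteins identified by 2-DGE
UNIQUE_ON_GELS = 1600          # estimated unique proteins on the gels
TOTAL_IDENTIFIED = 950         # "almost 950" cytoplasmic plus membrane proteins

print(round(coverage_percent(CYTOPLASMIC_IDENTIFIED, PREDICTED_ORFS)))         # 29
print(round(coverage_percent(CYTOPLASMIC_IDENTIFIED, PREDICTED_CYTOPLASMIC)))  # 41
print(round(coverage_percent(UNIQUE_ON_GELS, PREDICTED_CYTOPLASMIC)))          # 82
print(round(coverage_percent(TOTAL_IDENTIFIED, PREDICTED_ORFS)))               # 34
```
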

Proteomics has recently been used to identify
proteins of biotechnological relevance
in halophilic archaea. A pioneering study (132) examined
cell extracts of H. salinarum grown with 3.5,
4.3, or 6.0 M NaCl by 2-DGE, and observed 773 
spots, 56 of which appeared to be differentially regulated.
As the authors pointed out, protein identification
using MALDI-TOF analysis of tryptic peptides
is not efficient for halophilic organisms because their
proteins contain a low proportion of the basic residues
targeted by trypsin; thus, the more sophisticated
electrospray ionization quadrupole TOF (ESI-Q-TOF) MS/MS
approach was used to identify more than 50 proteins
from spots. These included inosine monophosphate 
dehydrogenase, which was more abundant in cells 
grown at high-salt concentration. Because of the bio- 
technological application of such a salt-tolerant en- 
zyme, the recombinant form was obtained and fur- 
ther characterized. In a subsequent extension of this 
work, the same approach was used to identify 29 pro- 
teins from spots after 2-DGE that appeared to be dif- 
ferentially regulated after growing cells at different 
salinities (37). Of these, six were of unknown func- 
tion and, based on detailed bioinformatic analyses, 
recombinant expression and immunological pull-down 
assays, one of them was proposed to catalyze the acetylation
of a ribosomal protein (L13). This approach
to identifying halophilic enzymes of biotechnological
significance was recently applied to H. salinarum grown
in the presence of petroleum hydrocarbons (4% diesel). 
Five spots on 2-DGE were shown to vary dramati- 
cally after analysis of cell extracts from diesel-grown 
and control cells, although the proteins involved were 
not reported (92). 


Proteomics of Nonmethanogenic, 
Nonhalophilic Archaea 


Sulfolobus species 


A complete proteomic analysis of a Sulfolobus 
species has not been reported. However, some semi- 
nal proteomic work has been performed. In the first 
proteomics study (178), 2-DGE analysis was per- 
formed on Sulfolobus strain B12 and demonstrated 
that thermotolerance was accompanied by the syn- 
thesis of three heat shock proteins. This extremely 
thermophilic acidophile grows optimally at 70°C and 
pH 3.5 and acquired thermotolerance at 92°C after 
exposure at 88°C for one hour. A heat shock response 
was also observed in S. acidocaldarius by 2-DGE 
(130). The same approach was also used with this or- 
ganism to show phosphorylation of a specific protein 
in response to phosphate starvation, and this led to the
first proposal for a two-component regulatory system 
in the Archaea (130). 2-DGE was subsequently used
by others to identify phosphoproteins in Sulfolobus sp.
(e.g., 112). Very recently, a double-labeling
procedure involving S. solfataricus growth
on ¹⁵N- and ¹³C-enriched media was reported. The
approach improves the efficiency of protein identifica- 
tion as well as enabling accurate quantitation of pro- 
tein abundance, although the biology of the organism 
was not specifically addressed (167). 


Metallosphaera sedula 


The approach used to investigate the response 
to heat shock in Sulfolobus species has been applied 
to examine the proteome of another extreme thermo- 
acidophile, M. sedula (75). However, in this case, 
continuous culture was used to minimize growth 
phase and growth rate effects. M. sedula was grown 
in a 10-liter chemostat at 74°C, pH 2.0, at a dilution 
rate of 0.04 h⁻¹, and was subjected to both abrupt
and gradual temperature shifts. Growth in continu- 
ous culture for 100 h at the supraoptimal tempera- 
ture of 81°C (the stress condition) led to an approx- 
imately sevenfold lower steady-state cell density than 
that observed for growth at or below 79°C. SDS- 
polyacrylamide gel electrophoresis (PAGE) (both one 
and two dimensional) revealed significantly higher 
levels (sixfold increase) of a 66-kDa stress-response 
protein (MseHSP60), immunologically related to 
Thermophilic Factor 55 from Sulfolobus shibatae 
(178), which was later determined to be the archaeal 
thermosome. The thermosome was clearly the most 
abundant intracellular protein apparent by 2D SDS- 
PAGE. If the culture that had been acclimated to 
81°C was returned to a lower temperature (74°C), 
the amount of thermosome reverted to a level ob- 
served prior to thermal acclimation. Furthermore, 
when the previously acclimated culture (at 81°C) was 
shifted back from 74 to 81°C, without going through 
gradual acclimation steps, the result was the immedi- 
ate washout of cells from the chemostat, indicative 
of a loss of thermotolerance. This study showed that 
gradual thermal acclimation of M. sedula could only 
extend the upper growth temperature limit of stable 
growth for this organism by 2°C (and this occurred at 
reduced cell densities). 
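
For readers less familiar with continuous culture, the dilution rate quoted above can be translated into more intuitive quantities. The sketch below assumes an ideal, well-mixed chemostat at steady state, in which the specific growth rate equals the dilution rate. That assumption and the function names are illustrative and are not taken from reference 75.

```python
import math


def doubling_time_h(dilution_rate_per_h: float) -> float:
    """Doubling time (h) at steady state, assuming growth rate == dilution rate."""
    return math.log(2) / dilution_rate_per_h


def mean_residence_time_h(dilution_rate_per_h: float) -> float:
    """Mean residence time (h) of liquid in the vessel, equal to 1/D."""
    return 1.0 / dilution_rate_per_h


def volume_turnovers(duration_h: float, dilution_rate_per_h: float) -> float:
    """Number of reactor volumes replaced over the given duration."""
    return duration_h * dilution_rate_per_h


D = 0.04  # h^-1, dilution rate quoted for the M. sedula chemostat

print(f"{doubling_time_h(D):.1f} h doubling time")          # 17.3 h doubling time
print(f"{mean_residence_time_h(D):.1f} h residence time")   # 25.0 h residence time
print(f"{volume_turnovers(100, D):.1f} volume turnovers")   # 4.0 volume turnovers
```

Under these assumptions, the 100 h of growth at 81°C corresponds to about four volume turnovers, which gives a sense of how long the culture was held at the stress condition.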


Pyrococcus furiosus 


In one of the few proteomics studies directed to- 
ward membrane-bound proteins (80), cell extracts 
of Pyrococcus furiosus were separated into soluble 
and insoluble fractions and analyzed by 2-DGE,
MALDI-TOF, and micro-LC-ESI-MS/MS; a total of
32 membrane proteins and 34 cytoplasmic proteins 
were identified. Based on bioinformatic analyses for 
signal peptides (SignalP, TargetP, and SOSUISignal) 
and transmembrane-spanning α-helices (TSEG,
SOSUI, and PRED-TMR2), it was concluded that 23 
of the 32 proteins (72%) from the membrane fraction 
should be membrane associated, and that all of the 
proteins from the cytoplasmic fraction should be in 
the cytoplasm. In a separate study, 62 cytoplasmic 
proteins were separated by 2-DGE, and these were 
used in a systematic comparison to demonstrate
the much greater efficiency of the micro-LC-MS/MS
approach in identifying proteins, particularly
smaller ones (<25 kDa), compared with the routinely
used MALDI-TOF methods (109). However, the two 
approaches are complementary, and both should be 
utilized if possible. 


Thermoplasma acidophilum 


Proteomic-type, 2-DGE analyses have been ap- 
plied to T. acidophilum and demonstrated some un- 
expected eucarya-like properties of the organism. 
Cell-free extracts were fractionated, and low-molec- 
ular-weight proteins were systematically analyzed by 
N-terminal Edman sequencing. This led to the dis- 
covery of a ubiquitin-labeled peptide, suggesting that 
ATP-ubiquitin-dependent proteolysis occurs in this 
organism (189). In a related study from the same 
group, 2-DGE was used to analyze purified protea- 
somes from T. acidophilum (200). 


TRANSCRIPTOMICS 


The relatively small sizes (<2.5 Mb) of several ar- 
chaeal genomes, and the increasing availability of full 
genome sequences for these organisms, make DNA 
microarrays (155) a strategic tool for their study. 
While present knowledge of the biology of the third 
domain of life is still lagging in comparison with the 
Bacteria, the rapid maturation of microarray technol- 
ogy, in conjunction with improvements in bioinfor- 
matics and data analysis tools, should close this gap in 
the near future. Efforts to determine the transcriptome 
of archaea using DNA microarrays are reviewed in 
this section. 


Considerations for Genomewide 
Transcriptional Response Experiments 


One of the first realizations that arise from the 
use of DNA microarrays to investigate genomewide 
transcriptional responses is the sensitivity of gene expression
to even minor (and sometimes unintended)
perturbations in growth conditions. Thus, cultures 
must be grown, whether in batch or continuous mode, 
under carefully controlled conditions to reliably as- 
sess responses to intended environmental and nutri- 
tional perturbations. This may prove to be difficult 
for archaea that grow under extremes of temperature, 
pH, salinity, or strict anaerobic conditions. There are 
many issues to consider when choosing between batch 
and continuous culture for differential expression ex- 
periments with archaea. For example, a key advantage 
of chemostat culture is the ability to regulate cell den- 
sity and maintain nutrient-limited, steady-state growth 
for extended periods (164). However, chemostats are 
more difficult to operate than batch reactors. Fortu- 
itously, the extreme growth conditions characteristic 
of many archaea can eliminate issues with culture 
contamination that can plague continuous growth of 
conventional mesophilic microorganisms. Long-term 
chemostat operation is known to be affected by signif- 
icant wall growth; the seeding of the culture by sessile 
biofilm-associated cells can be problematic in inter- 
preting transcriptional response data (139). While 
biofilm formation has not been extensively examined 
in the Archaea, P. furiosus (140) and Thermococcus 
litoralis (144, 145) have been observed to form biofilms
in chemostat culture. While batch cultures are
typically easier to operate than chemostats, sampling 
times are a critical issue. Preferably, cells are perturbed 
and sampled during exponential phase in batch cul- 
ture. If these steps are performed too early or late, it 
may be almost impossible to separate growth phase 
and growth rate effects from the intended transcrip- 
tional response. Cell density should be carefully con- 
trolled since microbial behavior, such as cell signaling 
or aggregation, can be cell density dependent, as has 
been noted for cocultures of the hyperthermophilic bac- 
terium Thermotoga maritima and M. jannaschii (90). 

Although not often considered in functional ge- 
nomics experiments, gene expression can be growth 
phase dependent; the extent to which this is impor- 
tant for archaea compared with bacteria is not yet 
known. To illustrate the effect of growth phase on 
the transcriptome, a culture of S. solfataricus grown 
on peptide-based media at 80°C and pH 4.0 was examined
at mid-exponential, late-exponential, and early-stationary
phases. Overall, almost 600 ORFs (approximately
20% of the genome) were differentially
expressed 2-fold or more between growth phases 
(Fig. 1) (S. Tachdjian and R. M. Kelly, unpublished 
results). 

To determine the time course response of S. sol- 
fataricus to heat shock, a 2-liter batch culture was 
grown to mid-exponential phase at 80°C, and the 
temperature was then shifted to 90°C within 10 min.
The culture was sampled 10 min before the temperature
shift was initiated, and then at 5, 30, and 60 min
after the temperature had reached 90°C. As antici- 
pated, the results demonstrated that many genes re- 
spond dynamically to heat shock. For example, more 
than one third of the genome was found to be differ- 
entially expressed within 5 min of the culture reach- 
ing 90°C, but very little differential expression oc- 
curred between 30 and 60 min (see Fig. 1). Similar 
results were obtained after heat shock of the hyperthermophilic
bacterium T. maritima, where differential
gene expression patterns could be categorized
into immediate, short-term, and long-term responses 
(142). As discussed below, the same was true for the 
cold shock response (from 95 to 72°C) of P. furiosus, 
although the low temperature slowed
acclimation, which occurred over a 5-h period
(188). Failure to track time-dependent gene expres- 
sion patterns can lead to inaccurate interpretation of 
biological phenomena. For example, in an S. solfataricus
acid shock experiment (pH 4.0 to 2.0) similar to
the heat shock experiment described in this section, 
less than 2-fold changes were observed for the hypo- 
thetical proteins SSO0337 and SSO2887 after 60 min, 
in comparison with pre-acid shock levels (Fig. 2). 
However, SSO0337 was more than 8-fold upregulated
at 5 min and 2-fold upregulated at 30 min compared
with pre-acid shock values. On the other hand,
SSO2887 was unaffected after 5 min but was downregulated
3.6-fold and 2.2-fold after 30 and 60 min, respectively.
Clearly, the transcriptional response is significant
for both ORFs although this was not clear from “be- 
fore-and-after” samples comparing the 60-min and 
pre-acid shock expression levels. 
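
The difference between an endpoint comparison and a time course can be made concrete with a short Python sketch. It encodes the SSO0337 values quoted above as signed fold changes relative to the pre-shock sample. The 5-min value is a lower bound ("more than 8-fold"), and the 60-min value of 1.5 is a hypothetical placeholder standing in for "less than 2-fold"; the 2-fold threshold and the function names are likewise assumptions made for illustration.

```python
def responds_at_any_time(fold_by_minute: dict, threshold: float = 2.0) -> bool:
    """True if the magnitude of the fold change reaches the threshold at any time point."""
    return any(abs(fc) >= threshold for fc in fold_by_minute.values())


def responds_at_endpoint(fold_by_minute: dict, threshold: float = 2.0) -> bool:
    """True if the fold change at the last sampled time point reaches the threshold."""
    last_minute = max(fold_by_minute)
    return abs(fold_by_minute[last_minute]) >= threshold


# Signed fold change versus pre-shock (positive = upregulated); keys are minutes.
# 8.0 is a lower bound from the text; 1.5 is a hypothetical stand-in for "<2-fold".
sso0337 = {5: 8.0, 30: 2.0, 60: 1.5}

print(responds_at_endpoint(sso0337))   # False
print(responds_at_any_time(sso0337))   # True
```

A "before-and-after" design corresponds to `responds_at_endpoint` and would classify this ORF as unresponsive, whereas sampling at intermediate times reveals the transient induction.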


Transcriptional Response Analysis of Archaea 


Although DNA microarrays have only begun to 
be used for the study of archaea, interesting insights 
into specific aspects of the biology of these organisms 
have already been obtained from representatives of 
most of the major types, including the methanogens, 
halophiles, and hyperthermophiles. Studies are described
below, commencing with the organism that has been
the most studied to date, the heterotrophic anaerobe
P. furiosus.


Pyrococcus furiosus 


Influence of sulfur on P. furiosus metabolism. 
The first application of DNA microarrays to study the 
metabolism of either an archaeon or a hyperther- 
mophile was reported in 2001 using P. furiosus (157). 
An array was constructed using full-length PCR products
for 271 of the 2,065 genes annotated in its genome.

Figure 1. Time-dependent response of Sulfolobus solfataricus to a temperature shift from 80°C to 90°C (Tachdjian and Kelly,
unpublished). Histogram (top) and Venn diagram (bottom) represent the number of genes differentially expressed more than
2-fold at each stage of the heat shock. HS, heat shock.

Figure 2. Time-dependent differential expression of ORFs SSO0337 and SSO2887 from Sulfolobus solfataricus subjected to a
pH shift. ORFs SSO0337 (A) and SSO2887 (B) were subjected to a pH shift from pH 4.0 to pH 2.0 at 80°C (Tachdjian and
Kelly, unpublished). Samples were taken 1 min before pH shift was initiated, then 5, 30, and 60 min after reaching pH 2.0 (lines
drawn through points). Significant levels of differential expression occurred at intermediate sampling times, although no substantial
changes were observed for the pre-acid shock versus 60-min contrast.


The genes included those that are predicted to encode 
proteins mainly involved in the pathways of sugar 
and peptide catabolism and in the utilization of metals 
(such as Fe, Ni, W, and Mo) and the biosynthesis of 
cofactors, amino acids, lipids, polyamines, ribosomes, 
and nucleotides, together with putative chaperonins, 
ATPases, and transcriptional regulators. P. furiosus 
obtains energy for growth by converting either sugars 
or peptides to organic acids, CO₂, and hydrogen. Cells
grow well in the absence of S°, but if it is present, it is
reduced to H₂S. For the array analysis, a batch comparison
was carried out where RNA was derived
from cells grown in the presence and absence of S°. 
The expression of 21 of the 271 genes decreased dra- 
matically (>5-fold) in the presence of S°, and of these, 
18 encoded subunits associated with the three differ- 
ent hydrogenase systems that have been characterized 
from P. furiosus. The other three encode homologs 
of ornithine carbamoyltransferase and Hyp, both of
which appear to be involved in hydrogenase biosyn- 
thesis, as well as a conserved hypothetical protein. 
The presence of S° resulted in the upregulation by 
more than 2-fold of the expression of two previously 
uncharacterized genes. Their products were termed 
SipA and SipB (for sulfur-induced proteins), and it was 
proposed that they are part of a novel S°-reducing, 
membrane-associated, iron-sulfur cluster-containing 
complex. For example, SipB was predicted to contain 
two [4Fe-4S] clusters that might be involved in providing
electrons for S° reduction. Other genes whose
expression was upregulated by S° encoded a putative 
flavoprotein and a second putative iron-sulfur protein, 
both of which may also be involved in S° reduction. 
However, note that the mechanism by which P. furiosus
generates H₂S is still largely unknown. Although
the DNA microarray approach generates strictly comparative
results, the fluorescence intensities associated
with a given spot on the array give a qualitative indication
of the degree to which the relevant gene is
expressed. Consequently, it was shown that, of the 
twenty S°-independent genes that appeared to be the
most highly expressed (at more than 20 times the detection
limit), twelve encoded enzymes that
had been previously purified from P. furiosus biomass, 
indicating that their protein products are among the 
most abundant within the cell. Conversely, none of 
the products of the thirty-four S°-independent ORFs
that were not expressed above the detection limit had 
been characterized biochemically. As yet, the effects
of S° on the expression of all genes encoded by the
P. furiosus genome have not been reported.


P. furiosus transcriptional response to thermal 
stress. The effect of supraoptimal temperatures on 
gene expression in P. furiosus was examined with 
a targeted DNA microarray containing 201 genes 
(165). These were predicted to encode proteins in- 
volved in proteolysis, the stress response, proteolytic 
fermentation, and glycoside hydrolysis. When a cul- 
ture growing with peptides as the carbon source was 
shifted from 90 to 105°C for one hour, differential ex- 
pression (4-fold or higher) of several genes was noted, 
including the thermosome (PF1974), small heat shock 
protein (PF1883), and two other putative molecular 
chaperones (PF0963 and PF1882, CDC-48 homologs, 
VAT-related). Of the 42 protease-related genes moni- 
tored, only two were differentially expressed more 
than 5-fold; pyrolysin (PF0287) was downregulated,
while a subtilisin-like protease (PF0688) was upreg- 
ulated. The two ATP-dependent proteases in P. furio- 
sus responded to heat shock in different ways. The 
Lon protease (PF0467) was downregulated (3-fold), 
and while the expression of the genes encoding the
two proteasome β-subunits (PF0159 and PF1404)
was slightly stimulated (2-fold), that encoding the
α-subunit (PF1571) was downregulated (4-fold), and the
proteasome ATP-dependent regulatory subunit (PAN;
PF0115) was not affected by heat shock. This result
raises questions concerning 20S proteasome assembly 
and function during thermal stress, and cellular abun- 
dance of this complex. Genes related to proteolytic 
fermentation were among the most highly expressed 
under normal growth conditions and were either not 
affected or were downregulated to varying extents 
upon heat shock. Compatible solute formation under 
thermal stress was indicated by the induction of genes 
encoding a putative trehalose synthase (PF1742) and 
L-myo-inositol 1-phosphate synthase (PF1616), the 
latter result being obtained through Northern analysis.
Mannosylglycerate formation (e.g., PF0591) was
not induced. Several glycosidase genes were significantly
induced upon heat shock (e.g., PF0073, PF0076,
PF0442), as were genes encoding maltose/trehalose-binding
proteins (e.g., PF1739 and PF1938). Saccharide
recruitment involving glycosidases during thermal
stress may serve to meet the increased bioenergetic
needs of the cell or to supply substrates for compatible solute
synthesis.


Influence of carbon source on P. furiosus metabolism.
The first complete-genome DNA microarray to
be constructed for a hyperthermophile or for a non- 
halophilic archaeon was also for P. furiosus (156). The 
array contained PCR products for the 2,065 genes 
that have been annotated in the genome; all but 105 
of which were full-length (the exceptions were ap- 
proximately 1 kb). A batch comparison was carried 
out where cells were grown at 95°C using either pep- 
tides (hydrolyzed casein) or the disaccharide maltose 
as the primary carbon source in the presence of ele- 
mental sulfur. In addition to providing information 
on changes in gene expression, the DNA microarray 
approach can be used to assess which genes are not significantly
expressed, and this was the case for approximately
20% (398 of 2,065) of genes in both growth
conditions. For the remaining 80%, the focus was on 
genes whose expression appeared to be strongly regu- 
lated (signal intensity changed by at least 5-fold) by 
the presence of peptides or maltose. As illustrated in 
Fig. 3, expression of these genes is essentially shut 
down under one or the other condition. This applied 
to a total of 125 genes, most of which (82 of 125) 
were part of 27 clusters of two or more adjacent genes, 
indicating the extensive and coordinated regulation of 
the expression of putative operons. A total of 18 oper- 
ons were upregulated by growth on maltose and 9 by 
growth on peptides, although 5 of the 27 operons encoded
only (conserved) hypothetical proteins and so
their functions could not be assessed.

Figure 3. Fluorescence intensities of Pyrococcus furiosus DNA microarrays. (A) cDNA versus cDNA derived from two independent
cultures of cells grown with peptides as the carbon source. (B) cDNA versus cDNA derived from two independent
cultures of cells grown with peptides or maltose as the carbon source. In A, the upper and lower diagonal line pairs indicate
twofold and fivefold changes in the signal intensities, respectively, while only the lines indicating fivefold changes are given in
B. See text for details. Reproduced from the Journal of Bacteriology (156) with permission of the publisher.

Those operons
regulated by maltose and its metabolites include those 
that are responsible for the biosynthesis of 12 amino 
acids (Glu, Arg, Leu, Val, Ile, Ser, Thr, Met, His, Phe, 
Trp, and Tyr), of ornithine, of citric acid cycle intermediates,
and for maltose transport. On the other
hand, operons that are upregulated in peptide-grown 
cells included those encoding enzymes involved in the 
production of organic acids and 2-ketoacids from the 
amino acids serving as carbon and energy sources. 
These findings were consistent with the pathways that 
had been proposed for amino acid degradation in- 
volving transamination and production of energy- 
yielding CoA derivatives (6). This was confirmed ex- 
perimentally by finding that acetate was the major 
acid in spent media for maltose-grown cells, in con- 
trast to isovalerate, butyrate, isobutyrate, and pheny- 
lacetate in spent media of peptide-grown cells (157). 
Several sugar metabolism genes that did not appear to 
be part of operons were highly regulated. These in- 
cluded three in peptide-grown cells unique to the glu- 
coneogenic pathway, and three in maltose-grown cells 
that are unique to the novel glycolytic pathway found 
in this organism (182). The microarray results also 
provided an unexpected insight into the regulation of 
the hydrogenases of P. furiosus (156). In maltose- 
grown cells, the expression and the activities of the 
two cytoplasmic (I and II) and one membrane-bound 
enzyme decrease if S° is present in the growth medium 
(6, 156). However, the genes encoding the four sub- 
units of one of the two cytoplasmic hydrogenases, 
hydrogenase I, were expressed in peptide-grown cul- 
tures, even though S° was present, and the hydroge- 
nase activity in the cell extracts remained very low. 
The role of what appears to be inactive hydrogenase 
I in peptide-grown cells is still not understood. 


Response of P. furiosus to cold shock. While the re- 
sponse of hyperthermophilic organisms to supraopti- 
mal growth temperatures has been the subject of sev- 
eral investigations, as discussed herein, the effects of 
suboptimal temperatures on archaeal physiology and 
metabolism have received little attention. This is of 
some interest as genomes of archaea generally lack the 
canonical cold shock protein A (CspA) and ribosomal- 
binding factor A (RbfA) found in bacteria, proteins 
that are thought to enhance ribosomal function at low 
temperature (56). The first such investigation of a 
member of the Archaea using DNA microarrays mon- 
itored the effects of growing P. furiosus at the lower 
end of its temperature range for significant growth, 
72°C (doubling time ~5 h) and at 95°C, which is near 
its optimal growth temperature (doubling time ~1 h) 
(188). The study included both a batch comparison 
of cells grown at the two temperatures and a kinetic or 
shock experiment where cultures were shocked by 
rapidly dropping the temperature from 95 to 72°C. 
This shock resulted in a 5-h lag or acclimation phase, 
during which time little growth occurred. Transcrip- 
tional analyses showed that cells undergo three very 
different responses at the suboptimal temperature: an 
early shock (over ~2 h), a late shock (at 5 h), and an 
adapted response (seen in the batch comparison, oc- 
curring after many generations growing at 72°C). 

In general, at the suboptimal growth tempera- 
ture, much less mRNA was present for certain cellu- 
lar processes (188). For example, after 1, 2, and 5 h, 
there were 171, 69, and 152 genes, respectively, 
downregulated by >2.5-fold. On the other hand, 
there were several genes whose expression was sig- 
nificantly upregulated at 72°C, and their products are 
assumed to be involved in the processes by which the 
cells become acclimated to the lower temperature. 
After 1, 2, and 5 h at 72°C, and in cells adapted to 
72°C, there were 49, 35, 30, and 59 ORFs, respec- 
tively, that were significantly upregulated (>2.5- 
fold). These encoded proteins involved in translation, 
solute transport, amino acid biosynthesis, and tung- 
sten and intermediary carbon metabolism, as well as 
numerous conserved/hypothetical and/or membrane- 
associated proteins. The overall response to subopti- 
mal temperatures is summarized in Fig. 4. A priority 
for the cell appears to be the production of amino 
acids, by transport, intracellular proteolysis and bio- 
synthesis, for the production of cold-specific proteins. 
Some are involved in translation and influencing 
DNA topology, but a major problem in understand- 
ing the overall response is that many of the genes are 
of unknown function. For example, of the 22 puta- 
tive cold-responsive operons that were identified, 
9 contain exclusively or predominantly conserved hy- 
pothetical proteins. 
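
The >2.5-fold criterion used above can be illustrated with a short sketch. It assumes that each gene has a single background-corrected signal intensity per condition and that a gene is called regulated when the ratio of the two intensities exceeds 2.5 in either direction. The gene names and intensities are hypothetical, and the replicate handling and statistical filtering of the cited study (188) are not reproduced.

```python
def classify_fold_changes(reference, treatment, threshold=2.5):
    """Label each gene 'up', 'down', or 'unchanged' from two signal intensities."""
    calls = {}
    for gene, ref_signal in reference.items():
        ratio = treatment[gene] / ref_signal
        if ratio > threshold:
            calls[gene] = "up"            # more than threshold-fold higher
        elif ratio < 1.0 / threshold:
            calls[gene] = "down"          # more than threshold-fold lower
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical intensities (arbitrary units); not data from the study.
signal_95C = {"geneA": 1000.0, "geneB": 800.0, "geneC": 1200.0}
signal_72C = {"geneA": 3000.0, "geneB": 200.0, "geneC": 1500.0}

calls = classify_fold_changes(signal_95C, signal_72C)
print(calls)
```

Working with log2 ratios instead of raw ratios would make the up and down thresholds symmetric about zero, which is a common choice for scatter plots such as Fig. 3.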

Another characteristic of the cold response is the 
upregulation of a significant number of genes pre- 
dicted to encode membrane-bound proteins (188). 
For example, this includes 21 of the 59 genes upreg- 
ulated in cells adapted to 72°C (and 34 of the 59 fall 
into the conserved hypothetical category). Analysis of 
such cells by one-dimensional SDS-gel electrophoresis 
revealed two major membrane proteins, which stain- 
ing indicated were glycoproteins. They were termed 
CipA (PF0190) and CipB (PF1408), and their cold- 
induced expression was evident from the DNA mi- 
croarray data and was confirmed by real-time (RT-) 
PCR. Both appear to be solute-binding transport pro- 
teins that are related phylogenetically to some cold- 
responsive genes identified in certain bacteria (23). It 
was postulated that the Cip proteins may represent a 
general prokaryotic-type cold shock response mech- 
anism that is present in bacteria and archaea, and 

Figure 4. Cellular processes involved in the cold shock response of Pyrococcus furiosus when grown at 72°C rather than at the 
optimal temperature near 100°C. Modified from the Ph.D. thesis (187) with permission of the author. 


even in hyperthermophilic archaea, although further 
evidence is needed to substantiate this (188). 


Methanogens 


It is perhaps surprising that, despite the consid- 
erable amount of proteomic analyses that have been 
carried out with methanogenic species (see “Proteo- 
mics of methanogens” above), there have been only 
two transcriptional profiling studies, both reported in 
2005. One (22) examined a temperature response of 
M. jannaschii, while the other (82) evaluated the re- 
sponse of M. mazei to different carbon sources. While 
these data should allow comparisons to be made with 
proteomic analyses, unfortunately, the corresponding 
experiments have not as yet been reported. 


M. jannaschii. M. jannaschii grows optimally at 
85°C, and its response to nonlethal cold shock (65°C) 
and lethal heat shock (95°C) was studied using mi- 
croarrays made of PCR products representing 99% 
of the 1,738 genes annotated on the genome (22). A 
total of 95 genes were differentially regulated by the 
heat shock, including those encoding both ATP-de- 
pendent and -independent chaperones. In contrast, a 
much more general response was seen to a decreased 
temperature, with 345 genes showing differential reg- 
ulation, more than 170 of which are conserved/hypo- 
thetical. Eight of the temperature-responsive genes 
were examined by RT-PCR, which showed on average 
a threefold greater change in the response than in the 
array data. The known genes regulated by the cold 
shock included many involved in the processes noted 
above in the general cold response of P. furiosus (188), 
such as maintenance of DNA topology, transcription, 
and translation. The proteins included topoisomerase, 
DNA helicase, RNA helicase, ribosomal subunits, 
prolyl isomerase, and proteases. However, in contrast 
to P. furiosus, an increase in amino acid biosynthesis 
was not observed, and specific changes to membrane- 
bound proteins were not identified. Direct comparison 
between these two archaea is not straightforward, 
however, as the M. jannaschii data represent one time 
point after the cold shock, and acclimation and adap- 
tation phases were not distinguished. In M. jannaschii, 
the genes involved in methanogenesis and an ATP syn- 
thase, which represent the primary pathway for en- 
ergy conservation, were downregulated about 2.5-fold 
at the lower temperature. 


M. mazei. The genome of M. mazei is 4.1 Mb in 
size and encodes 3,371 genes (47). PCR products 
(200 to 100 bp) were used to construct a microarray 
representing 3,269 (97%) of the genes. Of these, 2,480 
(76%) showed significant expression in cells grown 
on methanol or acetate. This is similar (80%) to the 
number of genes that were reported to be expressed 
in P. furiosus when assessed by the microarray ap- 
proach (156). Comparison of cells grown with 
methanol or acetate as the carbon source identified a 
total of 317 genes that displayed either an increased 
level (155 genes) or decreased level (162 genes) of ex- 
pression (>2.5 fold). More than 70% of them were 
organized into potential operons. Many reflected 
known or suspected differences in the pathways of 
methanogenesis and energy conservation using the 
two carbon sources, and the up- and downregulation 
of several key enzymes was confirmed by RT-PCR. 
Some unexpected findings included the higher require- 
ment for aromatic amino acids and the upregulation 
of peptide transporters, flavodoxins, and ferredoxins 
in acetate-grown cells, as well as the regulation of more 
than 100 genes of unknown or undefined function. 
This is the first comprehensive analysis of how gene ex- 
pression in a methanogenic archaeon varies with car- 
bon source as determined by mRNA levels, and a cor- 
responding proteomic analysis would be extremely 
informative. 


Halophiles 


Halobacterium sp. NRC-1. The first global analysis 
of the proteome of a haloarchaeon from an identifi- 
cation perspective was for Halobacterium sp. NRC-1 
(14). This seminal study also included the first whole- 
genome transcriptional analysis for an archaeon using 
arrays containing PCR products representing 2,413 
genes. Expression profiles were compared for the wild 
type and for three mutant strains that either lacked or 
overproduced bacteriorhodopsin. Each sample pair 
was analyzed four times independently, and it was 
stated that statistically significant changes in expres- 
sion level of 50% could be detected, although this 
seems a somewhat optimistic assertion. The muta- 
tions that led to a greater amount of purple mem- 
brane resulted in the differential regulation of 151 of 
the 2,682 genes examined (66 upregulated and 85 
downregulated) relative to the wild type. A surpris- 
ing number of regulated genes (98 of 151, or 65%) 
were of unknown function (compared with 41% of 
unknown function in the complete genome). The genes 
with annotated functions were involved in phototro- 
phy and membrane synthesis, which were upregulated. 
In contrast, genes involved in arginine fermentation 
(the alternative pathway by which this organism con- 
serves energy), were downregulated. Very few (7 of 50) 
of the genes that appeared to be regulated according to 
the proteome analysis showed corresponding changes 
at the transcriptional level. 


Very recently, whole-genome DNA microarrays 
were also used to investigate the discovery that 
Halobacterium NRC-1 can respire anaerobically us- 
ing N- and S-oxides as terminal electron acceptors 
(124). Each of the 2,473 genes was represented on the 
array by up to three 60-mer oligonucleotides. A total 
of 104 genes were upregulated, and 137 were down- 
regulated at least twofold during anaerobic respira- 
tion. These included 24 of the 109 genes in this or- 
ganism that are predicted to be involved in energy 
metabolism and included a 3-fold change in expres- 
sion of the dms operon, which encodes the proteins 
involved in DMSO reduction. The expression of al- 
most all of the genes involved in aerobic respiration 
was found to be unaffected by anaerobic growth, in- 
dicating that the organism is prepared to metabolize 
oxygen, even under anaerobic conditions. 


H. volcanii. The metabolism of H. volcanii was 
investigated by a novel variation of the DNA micro- 
array approach that did not utilize the genome sequence 
(which at the time of writing is still not available; 
www.genomesonline.org). A PCR-product library 
was generated from a genomic library, and this was 
used to construct a 2,880-spot, one-fold-coverage array 
(where each spot represented less than an entire gene) 
(193). About 10% of the spots (273 of 2,880) indi- 
cated a more than 2.5-fold change in expression of 
the representative gene when the carbon source for 
growth was changed from peptides to glucose. The 
clones represented by the 273 spots were sequenced, 
and the genes were identified by comparison with 
available databases, which included the genome se- 
quence of H. salinarum. The results for many genes 
were rationalized in terms of their predicted roles in 
transport, the central metabolic pathways, and growth 
phase. However, many regulatory patterns were unan- 
ticipated, including several genes encoding enzymes 
involved in metabolizing acetate and 2-ketoacids. It is 
estimated that, by this onefold coverage approach, 
there is a one-in-three chance that a gene will not be 
represented. Nevertheless, this study does demon- 
strate that a shotgun genomic library can be utilized 
to give important insights into global regulatory path- 
ways and can do so in the absence of a complete ge- 
nome sequence for the target organism. 
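
The source does not state how the one-in-three estimate was obtained, but it is consistent with a simple Poisson sampling model, sketched below under that assumption. If clones are drawn at random positions and the library totals c genome equivalents, the expected fraction of the genome hit by no clone is e^(-c); treating a gene as a single point is a further simplification, and the function name is illustrative only.

```python
import math


def fraction_missed(coverage):
    """Expected fraction of positions hit by no clone under a Poisson model."""
    return math.exp(-coverage)


for c in (1, 2, 3):
    print(f"{c}x coverage: {fraction_missed(c):.3f} of the genome unsampled")
```

At onefold coverage the model gives about 0.37, close to one in three; raising the coverage to threefold would reduce the unsampled fraction to about 0.05.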


A. fulgidus 


Regulation of heat shock response in A. fulgidus. 
The heat shock response of A. fulgidus (which grows 
optimally at 83°C) to a temperature shift from 78 to 
89°C was studied with a whole-genome cDNA mi- 
croarray (based on 500 to 2,000 bp PCR products) 
(148). Significant changes in expression between 5 
and 60 min after the temperature change were ob- 
served for 350 genes, 189 of which were upregulated 
and 161 downregulated (although the fold changes 
and the statistical significance were not stated). One 
of the most highly expressed ORFs, AF1298, ap- 
peared to be upregulated ~10-fold; this showed sim- 
ilarity to a transcriptional regulator identified previ- 
ously in P. furiosus (PF1790) (183). Electrophoretic 
mobility shift and footprinting assays demonstrated 
that (the recombinant form of) AF1298 was a DNA- 
binding protein and bound to regions upstream of 
two A. fulgidus genes, AF1298 and AF1971. It was 
proposed that AF1298 is part of an operon that en- 
codes a small heat shock protein, Hsp20 (AF1296) 
and a cell division control protein (cdc48, AF1297). 
By sequence analysis, AF1298 is only distantly related 
to the P. furiosus protein (PF1790), and it is thought 
that they may be members of a diverse protein fam- 
ily that mediates a heat shock response in hyperther- 
mophilic archaea. A. fulgidus is known to respond to 
heat shock by increased production of organic solutes, 
such as di-myo-inositol-1,1’ (3,3’)-phosphate (DIP), 
and to change the composition of membrane iso- 
prenoid lipids. However, with one exception (3-hy- 
droxy-3-methylglutaryl-CoA reductase, AF1736), the 
genes encoding enzymes relevant to these pathways 
did not show significant changes in expression (148). 


STRUCTURAL GENOMICS 
What Is Structural Genomics? 


In the 1990s, DNA-sequencing efforts leapt for- 
ward, beginning symbolically in 1995 with the deter- 
mination of the first complete genome from a free- 
living organism, the bacterium Haemophilus influenzae 
(59). Ten years later, there are currently ~300 com- 
plete bacterial or archaeal genome sequences avail- 
able, with more than twenty from the Archaea (85) 
(see Chapter 19). In the late 1990s, discussions be- 
gan on one of the next logical extensions of genomic 
research, understanding the proteome at the struc- 
tural level. This evolved into the concept of “struc- 
tural genomics” (SG), large-scale, rapid determina- 
tion of the three-dimensional structures of proteins 


using nuclear magnetic resonance (NMR) and X-ray 
crystallography techniques, with the lofty goal of de- 
termining the structure of a representative fold from 
all possible families of protein folds in the living 
world (26, 62, 98, 149, 161, 175). It was estimated 
that approximately 5,000 to 15,000 different struc- 
tures may be necessary to cover the estimated 1,000 
to 3,000 unique folds that may be represented in liv- 
ing organisms (62), although this estimate is being 
constantly revised (110). Such an undertaking would 
require extensive development of novel high-through- 
put (HTP) technologies at the levels of bioinformatics, 
gene cloning, heterologous protein expression (both 
screening for expression and milligram level produc- 
tion), crystallization, structure determination (NMR 
and X-ray), and data processing. In the United States, 
beginning in 2000, the National Institute of General 
Medical Sciences (NIGMS/NIH) created a five-year 
Protein Structure Initiative (PSI) (128, 138), devoted 
to development of new tools and approaches for au- 
tomating and increasing the rate of all these methods 
required for rapid production of protein structures. 
Nine centers in the United States (17, 25, 28, 128, 138, 
174), along with other international group efforts 
(67, 77, 86, 170), set out to develop these new tech- 
nologies in projects that focus in some cases on par- 
ticular target organisms, and in some, on particular 
target protein families using homologs from multiple 
organisms. 

It was well known that recombinant protein 
production presents several significant obstacles. Ini- 
tially, the bacterium E. coli was the host organism of 
choice due to the many well-developed tools available 
for heterologous protein expression (15, 16). High- 
throughput gene cloning turned out to be remarkably 
simple, with virtually 100% success rates (17) whether 
using standard ligation-based cloning (89, 150) or 
commercially available recombination-based cloning 
systems (117, 184). However, there is a remarkable 
attrition rate following cloning (Table 3) independent 
of research group, protocols used, and target list (33, 
38, 105, 160). About 20% of targeted proteins are 
successfully expressed and purified, 37% of purified 
proteins crystallize, half of these give diffraction qual- 
ity crystals, and 25% of these diffracting crystals fail 
to yield a structure. However, this is not to say the SG 
efforts have not yielded tremendous amounts of use- 
ful information. As of August 2005 (after almost five 
years of effort), more than 2,100 structures at atomic 
resolution have been deposited in the Protein Data 
Bank (PDB) by SG groups throughout the world (24, 
137, 177). 

Table 3. Success rates for the various steps in the SG protocol from gene to protein structure (137) in the Protein Data Bank (17) 

Status              No. of targets   % Success 
Cloned                      54,170       100.0 
Expressed                   31,388        57.9 
Soluble                     13,241        24.4 
Purified                    11,164        20.6 
Crystallized                 4,126         7.6 
Diffraction                  2,123         3.9 
NMR assigned                   610         1.1 
Crystal structure            1,551         2.9 
NMR structure                  508         0.9 
In PDB                       1,985         3.7 
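
The stepwise percentages quoted in the text can be recomputed from the cumulative counts in Table 3. The sketch below follows the crystallography route only and assumes that each stage is a subset of the one before it; the function and variable names are illustrative.

```python
# Target counts at successive stages, taken from Table 3.
stages = [
    ("Cloned", 54170),
    ("Purified", 11164),
    ("Crystallized", 4126),
    ("Diffraction", 2123),
    ("Crystal structure", 1551),
]


def stepwise_success(stage_counts):
    """Return (from_stage, to_stage, percent) for each consecutive pair."""
    results = []
    for (prev_name, prev_n), (name, n) in zip(stage_counts, stage_counts[1:]):
        results.append((prev_name, name, 100.0 * n / prev_n))
    return results


rates = stepwise_success(stages)
for prev_name, name, pct in rates:
    print(f"{prev_name} -> {name}: {pct:.1f}%")
```

On these counts, about 21% of cloned targets are purified, 37% of purified proteins crystallize, roughly half of the crystals diffract, and about 73% of diffracting crystals yield a structure, so that roughly a quarter fail at the last step.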

Why do so many proteins fail along this path- 
way? One class of proteins is well known to give sig- 
nificant problems for overexpression and crystal- 
lization: the membrane proteins (91). Initially most 
centers removed membrane proteins from their tar- 
get list (these typically represent about 25% of the 
ORFs in a genome). However, more recently experi- 
ments have been initiated for HTP expression and pu- 
rification of these difficult proteins (51, 57, 103, 114). 
In addition to problems with membrane proteins, a 
large number of nonmembrane proteins still fail to 
be heterologously expressed in E. coli (5) due to a 
lack of various posttranslational modifications. Im- 
portant posttranslational modifications include for- 
mation of heteromeric protein complexes, proteolytic 
processing, modifications such as phosphorylation or 
acetylation, and addition of organic (such as flavin) 
or inorganic (for example, the metals Fe and Zn) co- 
factors (5, 89, 171). As many as a third of the ORFs 
in a given bacterial or archaeal genome could be part 
of an operon, and thus potentially the protein prod- 
ucts may form a heteromeric protein complex, or at 
least coexpression may be required for assembly. This 
is compounded by the difficulty in predicting operons 
(55). Considering metal cofactors alone, approxi- 
mately one-third of all structurally characterized pro- 
teins contain a metal cofactor, and perhaps as many 
as half of all proteins could contain metal (81, 89). 
Given these caveats, the success rate in producing 
folded, stable proteins from a random selection of 
ORFs from a genome was originally predicted to be 
~20% (5), which is consistent with the observed suc- 
cess rate (Table 3). In the United States, NIGMS has 
instituted a second five-year Protein Structure Initia- 
tive, with centers focused on HTP protein and struc- 
ture production, and centers specifically tasked with 
addressing these difficult expression problems (often 
termed the “high-hanging fruit”) (138, 159). 


Archaeal Targets 


Many SG groups (not just those examining ar- 
chaea) have targeted specific organisms (5, 72), al- 
though it is clear that when expression of a gene and/or 
purification or crystallization of the protein fails, an 
ortholog from another organism may succeed (131, 
154). Thus, specific archaea have been targeted (5, 
38, 115), in addition to numerous proteins being cho- 
sen from archaea solely because they possess an or- 
tholog of the protein of interest. Genes from hyper- 
thermophilic archaea (and bacteria) are particularly 
useful because, even if the proteins they encode do not 
crystallize more easily than those from mesophiles (an 
anecdotal belief), they are certainly more stable than 
mesophilic proteins and thus easier to work with. 
Some of the pioneering work in SG was performed 
using archaeal target organisms, in particular, the hy- 
perthermophiles Pyrobaculum aerophilum (115) and 
Methanothermobacter thermautotrophicus (38). Some 
of the earliest work on target selection was performed 
using the P. aerophilum genome by attempting to as- 
sign protein folds to each ORF and predict which may 
contain novel folds (115). P. aerophilum is no longer 
a primary target as the center working on this organ- 
ism moved on to the disease-causing organism My- 
cobacterium tuberculosis, and more recently, to a 
more technology-based approach to focus on produc- 
ing properly folded proteins regardless of source (In- 
tegrated Center for Structure and Function Innova- 
tion) (138). Nonetheless this has resulted in a number 
of structures (e.g., 11, 44, 125, 136). 

One of the first uses of SG was to assign a bio- 
chemical function to a hypothetical protein based on 
a structure determination. This was performed using 
a M. jannaschii protein (194). M. jannaschii was orig- 
inally used as a test case by the Berkeley center (194), 
which has since shifted focus to structures of proteins 
from Mycoplasma species (99). The M. jannaschii 
project is still active, and 144 proteins are listed as SG 
targets in the PDB, mainly targeted by the Northeast 
Consortium (191). The PDB lists 33 M. jannaschii 
structures completed by SG groups, for example (83, 
97, 118). There are also several structures from the 
related species M. maripaludis deposited in the PDB 
but as yet unpublished (1YVC, 1YWX). 

The first (and most complete) archaeal structural 
genomics project was derived from pioneering work 
by a group based at the Ontario Cancer Institute at the 
University of Toronto. Using M. thermautotrophicus 
as the model organism, the goal was to test and be- 
gin development of the tools needed for HTP protein 
production and structure determination (38). A 
group of 424 (nonmembrane) proteins was targeted, 
and ~20% were produced in sufficient quantity and 
quality for structural analyses by either NMR or crys- 
tallography (38). The PDB TargetDB lists 213 current 
M. thermautotrophicus targets and 49 structures de- 
rived from SG groups (e.g., 30, 45). 

One of the most significant archaeal SG efforts 
involves the hyperthermophilic archaeon P. furiosus 
(5). As one of three target organisms (along with 
Caenorhabditis elegans and Homo sapiens) of the 
Southeast Collaboratory for Structural Genomics 
(SECSG) (168), the goal of this laboratory was to tar- 
get all the ORFs in this organism and to express them 
to yield proteins in a fully folded, functional form. As 
a free-living nonnucleated cell with a small genome 
(1.9 Mb representing ~2,000 ORFs), P. furiosus is an 
excellent candidate for attempting a genomewide SG 
project (5). To ensure that proteins that contain co- 
factors and/or are part of multiprotein complexes are 
expressed under conditions that enable them to fold 
properly, cells are grown, for example, in the presence 
of excess Fe or Zn for cofactors, or by coexpression 
of multiple ORFs for complexes. One of the first key 
issues addressed by the SECSG was precisely how 
many ORFs to target for cloning and expression. The 
original annotation of the P. furiosus genome de- 
posited in GenBank defined 2,065 putative ORFs. 
However, depending on the annotation, the number of 
ORFs predicted was as high as 2,261 (133). As a re- 
sult, the initial effort targeted 2,192 ORFs for cloning. 
Considering the difficulties in obtaining recombinant 
forms of membrane proteins, cofactor-containing pro- 
teins, and proteins that may only be stable when co- 
expressed with their partners from a multiprotein 
complex, it was anticipated that only about 20% of 
the target genes would yield stable, properly folded 
proteins (i.e., the “low-hanging fruit”). To date, 1,909 
ORFs (168) have been cloned into an expression vec- 
tor containing an amino-terminal affinity tag (MAH- 
HHHHGS-), which allows purification by immobi- 
lized metal affinity chromatography (IMAC), as well 
as detection using an enzyme-linked immunosorbent 
assay (ELISA) with a commercial antibody against the 
affinity tag (171). 

Screening thousands of clones at a preparative 
scale (1 liter culture) is prohibitive in terms of time 
and cost. As a result, screening was performed in a 
small scale (1 ml) expression system (SSE) using ro- 
botics to automate screening for protein expression (5, 
171). Screens were performed using several different 
growth conditions known to affect protein expression, 
such as culture medium, temperature, E. coli strain, 
etc. Using the SSE, 1-ml cultures are grown and het- 
erologous protein expression is induced in deep-well 
96-well plates overnight. The cultures are harvested 
by centrifugation, and the cells are lysed and then frac- 
tionated using vacuum filtration into a soluble frac- 
tion and an insoluble inclusion body fraction (solubi- 
lized with 6 M guanidine), with a whole-cell extract 
fraction retained for screening for membrane proteins. 
Using an antibody against the affinity tag, expression 
of the target protein (under a specific growth condi- 
tion) can be detected and the fraction in which it is 
expressed can be determined (water-soluble form, or 
insoluble, presumably as an inclusion body) (89). A 
number of recent reports demonstrate the usefulness 
of similar types of HTP screens for protein expression 
(e.g., 1, 29, 40, 79, 113, 127). 

These results for the expression screens are used 
to scale up production to (typically) a 1-liter scale for 
purification of milligram amounts of protein, which 
can then be used for both X-ray crystallography and 
NMR structural analyses. For clones that fail to ex- 
press (or express an insoluble, presumably unfolded 
protein), increasingly complex protocols are followed 
to attempt to obtain suitable expression and protein 
quality; for example, alternative E. coli expression 
strains, recloning with different expression vectors or 
affinity tags, different host organisms, etc. For P. fu- 
riosus, 2,373 cultures representing 1,006 unique ORFs 
(and 1,367 additional attempts to express these unique 
ORFs under different growth conditions) have so far 
been scaled up to 1 liter, of which 57% (575) success- 
fully expressed some protein (as determined by SDS- 
PAGE after one affinity column). Three hundred eighty- 
five unique proteins have been purified, and 259 of 
these have been analyzed by mass spectrometry 
and match the predicted mass of the target pro- 
tein (i.e., they are neither degraded nor posttranslationally modi- 
fied by E. coli). A total of 240 have been submitted 
for X-ray crystallography screening and 137 have been 
sent for NMR screening (Table 4) (168). A summary 
of output from all SG groups can be found at the Pro- 
tein Data Bank (137). At the time of writing (August 
2005), 89,721 targets have resulted in 54,530 clones, 
11,632 purified proteins, and 1,926 X-ray and 782 
NMR structures. 

In addition to P. furiosus, 31 targets from P. abyssi 
have been registered in the PDB and 11 structures have 
been completed but not yet released (136). P. hori- 
koshii has 124 targets registered by a number of dif- 
ferent centers and 45 structures available (e.g., 10, 53, 
172), and a large number of presently unpublished and 
unreleased structures (136). Other major archaeal 
SG projects (again, typically using archaea as sources 
of orthologous genes) are Sulfolobus solfataricus (328 
ORFs targeted) and other Sulfolobus spp. (25 ORFs 
targeted), which are primarily being performed at the 
Northeast Consortium (191) and the Joint Center for 
Structural Genomics (JCSG) (105); only a few struc- 
tures are available (35, 129), although many are com- 
pleted but unreleased (136). Archaeoglobus fulgidus 
SG (197 targets) is occurring primarily at the JCSG 
(105), the Midwest Center for Structural Genomics 
(MCSG) (100), and the New York Consortium (NYS- 
GXRC) (176). In addition, 50 proteins from other 
Archaeoglobus sp. have been targeted with 41 struc- 
tures available (e.g., 106, 153) and a large number un- 
released (136). 

Table 4. August 2005 production statistics for Pyrococcus furiosus proteins from gene to protein structure (137, 168) 

Stage                Number 
Genes cloned         1,909 
Expression attempt   1,106 
Expression success   575 
Pure protein         385 
Crystal              108/259 
Diffracts            59 
X-ray structure      29 
NMR folded           112/137 
NMR structure        2 

More than 300 ORFs from Thermoplasma aci- 
dophilum and T. volcanium have been targeted by at 
least five different groups, with 20 structures produced 
so far (e.g., 101, 163). Aeropyrum pernix has been 
targeted by a large number of groups (344 targets), and 
six structures are complete. A few ORFs have been 
targeted from H. salinarum, Halobacterium sp. NRC-1, 
M. mazei, and Ferroplasma acidarmanus (137). In ad- 
dition to archaea, some extremophilic bacteria are 
major SG targets; for example, T. maritima (one of the 
pioneering projects) (48, 105) and Aquifex aeolicus. 
The above-mentioned SG projects represent a snap- 
shot of a rapidly changing, ongoing process that will 
continue to target archaea until the primary SG goal 
of “filling structure space” has been completed. 

It is therefore clear that, within a remarkably short 
period of five years, tremendous strides have been 
made in developing HTP protocols for every step in the 
structure determination pipeline. For the low-hanging 
fruit, this has resulted in a dramatic reduction in time 
and cost for determination of high-resolution struc- 
tures. In addition to the structures determined, these 
initiatives have also resulted in creation of large banks 
of knowledge and materials that are available to the 
scientific community (138). This includes novel soft- 
ware tools, cloning and expression tools and vectors, 
and many purified proteins that can be used for a range 
of purposes, for example, producing antibodies and 
protein arrays (64), and most importantly, for func- 
tional analyses. However, despite these rapid advances, 
all the techniques currently in use are still empirical 
in nature. No clear predictive rules have emerged that 
can guarantee protein production leading to structure 
determination, although relationships between con- 
tact order and thermostability (147) and biophysical 
properties versus crystallization have emerged from 
structural studies (31). Essentially, each of these steps 
depends on the unique chemistry of each individual 
protein. The field is therefore at a stage of having a 
vastly enhanced, but nonetheless still brute force, ap- 
proach to structural genomics. As the available data 
expand, rules for predicting how well an individual 
ORF will behave at each stage of the process should 
ultimately become apparent. In the new U.S. Protein 
Structure Initiative, HTP production centers will combine
and optimize the tools that have been developed
for rapid screening and production of targets, and or- 
thologous genes will be used for recalcitrant targets; 
this will be facilitated by technology that uses a large 
degree of automation for very rapid cloning and screen- 
ing. The smaller research-based centers will tackle the 
much more difficult to express, high-hanging fruit, in 
particular, represented by membrane and posttransla- 
tionally modified proteins (138, 159). 


PERSPECTIVE: THE NEXT FIVE YEARS 


As “omics” tools become more accessible to the 
research community for the study of archaeal biology, 
significant progress in the confident annotation (unequivocal
evidence of the functional/structural characteristics
of ORFs/genes/proteins) of many of the archaeal
genomes will likely result in the next five years. Coupled
with useful genetic systems (see Chapter 21),
strategic use of functional genomics approaches will 
form the basis for complete and accurate genome an- 
notation. Archaeal genome sequence data will swell 
due to the rapidly falling sequencing time and costs. 
One outcome of this is the prospect of sequencing and 
functional genomics analyses of environmental sam- 
ples, studies focusing on community interactions in 
situ, and comprehensive insights into archaeal ecol- 
ogy of extreme environments (180). This could pro- 
vide clues to the roles that seemingly uncultured and 
“unculturable” archaea play in various ecosystems. 
One of the current frontiers that may come into clearer 
focus is the relationship between the membrane tran- 
scriptome and proteome in archaea; how the unique 
archaeal membrane composition and structure im- 
pacts this remains to be seen. While the archaeal bi- 
ology research community will no doubt benefit from 
generic advances in functional genomics tools, it will 
be important to develop genetic systems that comple- 
ment the use of these tools. Information gained from 
ongoing and emerging functional genomics efforts fo- 
cusing on archaea will provide clues and insights that 
will accelerate the genetic system’s development. Ar- 
chaea have also played an important role in the re- 
cently defined field of structural genomics. In the past 
five years, several species have been the targets of 
such endeavors. However, that field is now more di- 
rected toward representatives of protein families rather 
than specific organisms. Consequently, in the next five 
years, the inclusion of genes from archaea and the 
subsequent determination of protein structures will be 
more by accident than by design. In contrast, it is clear 
that the use of proteomic and transcriptomic tools is 
at an early stage in the study of archaeal biology. As 
analytical techniques become more robust and exquisite,
a comprehensive cataloging of the functional components
of the cell at a molecular level, coupled with
the functionality elucidated by “omics” protocols,
will give amazing new insights into archaeal biology.

Chapter 21


Molecular Genetics of Archaea 


KEVIN SOWERS AND KIMBERLY ANDERSON 


INTRODUCTION 


The Archaea are a phylogenetic lineage of mi- 
croorganisms genotypically distinct from the Bacteria 
and Eucarya (see Chapters 1 and 19). These microor- 
ganisms are morphologically similar to monocellular, 
nonnucleated bacteria and possess many biosynthesis 
pathways typically found in bacteria (see Chapter 2). 
In contrast, nucleic acid processing, including both 
transcription and translation, in general occurs by 
mechanisms similar to those in eucarya (see Chapters 6 
and 8). Like the Bacteria this domain of microorgan- 
isms exhibits a wide range of morphological and phys- 
iological diversity, which includes heterotrophs and au- 
totrophs, acidophiles and alkaliphiles, psychrophiles 
and thermophiles, high-solute-intolerant non- 
halophiles and halophiles. However, many archaea re- 
quire extreme conditions for growth. These include 
halophiles that require saline concentrations up to 
saturation, methanogens that require highly reduced 
anoxic environments poised below −350 mV, and
hyperthermophiles that grow at temperatures up to 
113°C. The unique characteristics of these micro- 
organisms have largely precluded attempts to apply 
bacterial or eucaryal genetic protocols to members of 
this domain. Colony growth of many of the
fastidious anaerobes in traditional anaerobic roll tubes
is not as practical as plating for screening large num- 
bers of clones; unique cell wall structures prevent the 
use of commonly used antibiotic genetic markers that 
target cell wall synthesis; bacterial gene promoters as- 
sociated with many of the commonly used genetic 
markers are not recognized by the archaeal transcrip- 
tional apparatus; bacterial plasmids and phages will 
not replicate in archaeal species. In recent years sev- 
eral laboratories have overcome these difficulties by de- 
veloping effective plating techniques, identifying ge- 
netic markers that do not target cell wall synthesis, 


fusing archaeal promoters with recombinant genes, 
and isolating native archaeal vectors and identifying 
promiscuous nonarchaeal vectors. Gene transfer systems
now exist for species within all three physiological
groups of archaea (halophiles, methanogens, and
nonmethanogenic hyperthermophiles), and development
of these systems has been reviewed (3, 72, 94,
103). Despite varying degrees of difficulty in applying
these protocols, they are routinely used by laboratories
conducting research on archaeal genetics and can be 
mastered by anyone with a fundamental knowledge of 
microbial genetic techniques. 


MEDIA AND GROWTH 


One of the initial challenges faced by investiga- 
tors in the early development of archaeal genetic sys- 
tems was the difficulty associated with growing many 
members of this domain on solidified medium for se- 
lection of recombinant clones. Fortunately, tech- 
niques have been developed to obtain efficient surface 
colonization for species of both the Crenarchaeota 
and Euryarchaeota (Table 1). 

Halophilic euryarchaeota are grown on standard 
agar-solidified medium supplemented with 12 to 25% 
NaCl and Mg²⁺ salts, which serve as osmolytes for
growth (91). As these microorganisms are aerobic 
mesophiles, they do not require specialized equipment 
used for growing hyperthermophiles and anaerobic 
methanogens. Because of the ease of growing these 
extremophiles, methods for genetic manipulation of 
haloarchaea have been developed and standardized 
over the past twenty years (see reference 30). 

Methods have been developed for efficient 
growth of fastidiously anaerobic methanogens on 
conventional methanogen medium solidified with 
agar in Petri dishes, including Methanobacterium spp.

Table 1. Physiological characteristics and growth efficiency of archaea on solidified medium

Organism                      Salinity (%)  Growth temp (°C)  Aerobic/anaerobic     Mean efficiencyᵃ (%)  Reference
Halophiles
  Halobacterium spp.          20-25         40-50             Facultative anaerobe  100                   91
  Haloferax volcanii          12.5-20       40-50             Aerobe                100                   91
  Haloarcula spp.             20            40-50             Aerobe                100                   91
Methanogens
  Methanococcus spp.          2-3           37                Strict anaerobe       90                    54, 58, 74
  Methanosarcina spp.         2-3           35                Strict anaerobe       55                    6, 74
  Methanothermobacter spp.    0.6           37                Strict anaerobe       80-100                43, 58
  Methanobrevibacter spp.     0.6           30-37             Strict anaerobe       90-100                43, 58
Thermophiles
  Sulfolobus spp.             0.01          65-78             Aerobe                100                   65
  Pyrobaculum spp.            1.5           92                Facultative anaerobe  100                   110
  Pyrococcus spp.             2-4           80-95             Strict anaerobe       ND                    34
  Thermococcus celer          2-4           80-95             Strict anaerobe       ND                    34

ᵃND, not determined.


and Methanobrevibacter spp. (43, 58), Methanococ- 
cus spp. (54, 74) and Methanosarcina spp. (6, 31, 52, 
54, 74, 108) (Table 1). However, owing to the re- 
quirement by methanogens for both anoxic and 
highly reduced growth conditions, solidified medium 
must be prepared and inoculated in an anaerobic 
glove box. Anaerobic medium containing agar is 
melted in sealed serum vials outside the glove box, 
then transferred into the glove box and poured into 
Petri dishes. Although methanogens will tolerate the 
oxygen concentrations in an anaerobic glove box (<1 
ppm) for purposes of inoculation, they will not grow 
without a reduced atmosphere. Inoculated media 
must be transferred to a gas-tight vessel or chamber 
containing a reduced atmosphere generated by hy- 
drogen sulfide for incubation and growth (6, 108). 
The long incubation periods of 5 to 30 days require 
that the jars selected are not wholly composed of 
plastics or other polymers, which are slightly perme- 
able to oxygen (104). Metal culture jars are available 
from commercial sources (Medica Instrument Mfg. 
Co., Don Whitley Scientific Ltd.), and they can be 
constructed from a modified pressure cooker (11), 
paint pressure canister (108), canning jar (6), or glove 
box airlock (74) or custom-made by a machine shop 
(10). Several commercially manufactured glass and/ 
or metal anaerobe jars suitable for growing methano- 
gens from TORBAL, Oxoid, and BBL are no longer 
available but often can be found in storage or through 
used-equipment distributors. 

In addition to a reduced atmosphere the mois- 
ture content of the medium is often a critical factor. 
If the plates are too moist, the closed environment of 
the anaerobe jar results in confluence of the colonies 
due to surface condensate; too low a moisture content 
is inhibitory to many osmotically sensitive species 
with S-layer cell walls. The moisture content of the 


plates is controlled by preincubating the plates in an 
anaerobic glove box at a known relative humidity for 
a set period of time, then storing the plates in a sealed 
container until they are inoculated. Both the hydro- 
gen sulfide concentration and moisture content re- 
quired for optimal plating efficiencies must be deter- 
mined empirically for each species. Another factor 
that can critically affect plating efficiency is exposure 
of the cells to trace amounts of oxygen during plat- 
ing in the anaerobic glove box. When cells are spread 
onto the surface of the solidified medium they are ex- 
tremely sensitive to oxygen as they are no longer pro- 
tected by the reduced medium. Glassware and spread- 
ing implements should be equilibrated in the glove 
box overnight to remove adsorbed oxygen. Care must 
be taken to ensure that the chamber atmosphere is at 
its minimum oxygen tension by using freshly charged 
palladium catalyst and by not opening the gas-lock 
for a period of time before inoculation to allow the 
chamber atmosphere to become anoxic. Inoculating 
cells in a molten top agar (0.5% wt/vol in medium) 
will often improve colonization efficiencies by pro- 
tecting the cells from trace oxygen exposure (58). The 
growth efficiency of Methanosarcina mazei Gö1,
which grows poorly on agar-solidified medium, was 
improved by selection for a spontaneous mutant that 
yielded greater plating efficiencies (31). 
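
The plating efficiencies discussed above, and tabulated in Table 1, are the fraction of plated cells that give rise to colonies. A minimal Python sketch of that calculation follows; it assumes the number of cells plated has been estimated independently (for example, by direct microscopic count), and the function name and example numbers are hypothetical rather than taken from the cited studies.

```python
def plating_efficiency(colonies: int, cells_plated: float) -> float:
    """Return plating efficiency as a percentage of the cells plated."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return 100.0 * colonies / cells_plated


# Hypothetical example: 110 colonies recovered from an estimated 200 cells.
efficiency = plating_efficiency(110, 200)
print(f"Plating efficiency: {efficiency:.0f}%")
```

Because both hydrogen sulfide concentration and moisture content must be optimized empirically for each species, a calculation of this kind would typically be repeated for each condition tested and the results compared.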

Colonization of methanogenic and nonmethano- 
genic hyperthermophiles on solidified medium is 
equally problematic, as agar is rapidly dehydrated at 
high temperatures, especially at the concentrations re- 
quired for it to remain solidified. Therefore, gellan 
gum (Gelrite) is used as the solidifying agent for 
growth of thermophiles and hyperthermophiles, 
which are incubated in a plastic bag or anaerobe jar 
to minimize dehydration (34, 65). Conventional polystyrene
disposable Petri plates will melt above 60°C;
therefore, glass or higher-temperature-resistant polyethylene
plates (Greiner Bio One International AG)
are used for thermophiles. Growth medium is inocu- 
lated at room temperature, and then the medium is 
incubated at the desired temperature. For anaerobic 
hyperthermophiles media are prepared and inocu- 
lated by the methods described for methanogens. 


STRAIN CHARACTERISTICS 


Archaea generate cell walls that range from pli- 
able paracrystalline protein and glycoprotein S-layers 
to rigid forms that include a pseudomurein mono- 
layer analogous to bacterial gram-positive cell walls 
and heteropolysaccharide with chemical properties of 
chondroitin (56) (see Chapter 14). Most of the species 
for which tractable gene transfer systems have been 
developed have cell walls composed of a paracrys- 
talline S-layer rather than a rigid cell wall (Table 1). 
For DNA-uptake procedures, spheroplasts are generated
by suspending S-layer cells in a Mg²⁺-free sucrose
buffer, in some instances with the addition of
EDTA, which disrupts the integrity of the S-layer protein
subunits (35). After transformation, the S-layer is
regenerated by resuspension of the cells in medium 
that contains divalent cations. 

Genetic systems have been developed for ar- 
chaeal species with rigid cell walls. Methanosarcina 
spp. grown in low-saline medium synthesize a rigid 
chondroitin-like (methanochondroitin) outer layer in 
addition to a glycoprotein S-layer, which causes cells 
to grow as multicellular aggregates. For genetic stud- 
ies these species are adapted to, and grown at, ma- 
rine saline concentrations, which prevents synthesis 
of methanochondroitin, enabling the cells to be grown 
as individual S-layer cells (101). Spheroplasts of these 
cells can be generated in Mg²⁺-free sucrose buffer for
transformation, and the S-layer can be regenerated 
as described above (73). Methanothermobacter spp. 
have a rigid pseudomurein cell wall, and spheroplasts 
can be generated with a pseudomurein endopeptidase 


from Methanothermobacter wolfei. Although the 
spheroplasts remain actively methanogenic in a su- 
crose buffer, methods for regenerating a cell wall have 
not yet been developed, which currently limits the 
tractability of these species for genetics (57, 77). 
Besides cell structure, cell physiology has also in- 
fluenced the development of genetic systems for the 
Archaea. Most of the currently available genetic sys- 
tems were developed for mesophilic and moderately 
thermophilic archaea (Table 2). In the extreme tem- 
perature ranges there has been limited success with 
hyperthermophilic archaea (2, 67), and there are no 
published reports for psychrophilic archaea. 


SELECTABLE GENE MARKERS 


Vectors have been developed for halophiles, meth- 
anogens, and nonmethanogenic hyperthermophiles 
(Table 3). Plasmids isolated thus far from archaea do 
not possess phenotypes that can be readily used for se- 
lection; the unique physiology of archaea precludes the 
use of many common genetic markers. As examples, 
antibiotics that act by disrupting cell wall synthesis are 
ineffective because archaea lack bacterial cell walls; 
high temperatures used to grow hyperthermophiles degrade
many antibiotics; high temperatures or high-salt
concentrations degrade many of the gene products 
that detoxify the selection marker; in the case of 
methanogens, chloramphenicol effectively inhibits pro- 
tein synthesis, but the product of chloramphenicol 
acetyltransferase inhibits methanogenesis (14). How- 
ever, there are several archaeal and orthologous genes 
from bacterial and eucaryal systems that have been 
used as effective genetic markers in the Archaea, and 
these are described below. 

The high intracellular salt concentration found 
in the moderate and extreme halophiles will denature 
proteins generated by many nonhalotolerant resis- 
tance loci currently in use. This problem has been cir- 
cumvented by the isolation of a genetic determinant 


Table 2. Gene transfer methods for archaea

Type            Method                              Species                                                   Reference
Transformation  Polyethylene glycol (PEG)-mediated  Haloferax volcanii, Haloarcula spp., Halobacterium spp.,  24-26, 67, 109
                                                    Methanococcus maripaludis, Pyrococcus abyssi
                Electroporation                     Methanococcus voltae, Sulfolobus spp.                     22, 83, 98
                Liposome-mediated                   Methanosarcina spp.                                       73
                CaCl₂ with heat shock               Sulfolobus solfataricus, Pyrococcus furiosus              2
Transduction    ΨM1-mediated                        Methanothermobacter thermautotrophicus                    71
Conjugation     Cell mating                         Haloferax volcanii, Sulfolobus acidocaldarius             39, 92
                Plasmid mediated                    Sulfolobus solfataricus                                   97


Table 3. Shuttle vectors for use in archaea and E. coli

Type of organism  Vector          Host organism(s)                                Genetic marker(s)ᵃ      Size (kb)           Characteristics                                                                         Reference
Halophiles        pHRZ            Halobacterium spp.                              Ani/Thi                 12.6                Unknown origin of replication                                                           69
                  pMDS20          Haloferax volcanii                              Mev/Amp                 10                  Derived from pMDS10                                                                     47
                  pMLH3           Haloferax volcanii                              Nov/Mev/Amp             11.3                Contains 2 selective markers (Nov/Mev)                                                  47
                  pMPKS4, 62      Halobacterium salinarium                        Mev/Amp                 8.6, 10.2           Used for expression of bacteriorhodopsin gene, bop                                      60
                  pNB102          Haloarcula spp., Halobacterium spp.             Mev/Amp                 9.1                 Derived from pNB101                                                                     120
                  pNG168          Halobacterium spp.                              Mev/Amp                 8.9                 Multiple cloning sites; blue/white screening in E. coli                                 30
                  pTA230-233      Haloferax volcanii                              Pyr, Trp, Leu, Hdr/Amp  7.4, 7.7, 7.8, 7.4  Blue/white screening in E. coli; requires ΔpyrE2, ΔtrpA, ΔleuB, or ΔhdrB host           4
                  pUBP2           Halobacterium salinarium, Haloferax volcanii    Mev/Amp                 12.3                Derived from pHH1 from Halobacterium                                                    18, 25
                  pWL102, pWL104  Haloferax volcanii, Haloarcula spp.             Mev/Amp                 10.5, 8.7           Derived from pHV2 from Haloferax                                                        18, 25, 62
                  pWL-Nov         Haloferax volcanii                              Nov/Amp                 9.4                 Derived from pWL102                                                                     82
                  pWL204          Haloferax volcanii                              Mev/Amp                 10.4                Expression vector based on pWL102                                                       81
Methanogens       pDTL44          Methanococcus maripaludis                       Pur/Amp                 12.7                Used for gene expression and complementation                                            107
                  pEA103          Methanosarcina acetivorans                      Pur/Amp                 12.4                Based on pWM313; used for gene expression; requires E. coli pir host for replication    5
                  pJK89           Methanosarcina acetivorans                      Pro/Amp                 9.1                 Contains proC from E. coli; used for complementation of ΔproC host                      89
                  pPB31-35        Methanosarcina acetivorans                      PA/Amp                  11.3                Blue/white screening in E. coli; requires E. coli pir host for replication              19, 119
                  pWLG30          Methanococcus maripaludis                       Pur/Amp                 12.7                Used for gene expression                                                                36
                  pWM309-321      Methanosarcina spp.                             Pur/Amp                 8.2-8.9             Based on pC2A; multiple cloning sites; blue/white screening in E. coli (except pWM321)  73
Thermophiles      pAG21           Pyrococcus furiosus, Sulfolobus acidocaldarius  Adh/Amp                 6.5                 More stable in E. coli due to low copy number                                           8
                  pCSV1           Pyrococcus furiosus, Sulfolobus acidocaldarius  Amp                     6.1                 Based on pGT5 from P. abyssi                                                            2
                  pEXADH          Sulfolobus solfataricus                         Hyg, Adh/Amp            The                 Contains 2 selective markers (Hyg, Adh) for use in S. solfataricus                      29
                  pEXlacOP        Sulfolobus solfataricus                         Hyg/Amp                 11.7                Contains lacS and lacTr from S. solfataricus                                            13
                  pEXSs           Sulfolobus solfataricus                         Hyg/Amp                 6.4                 Uses SSV1 ORI for replication in S. solfataricus                                        22
                  pYS2            Pyrococcus abyssi                               Pyr/Amp                 6.4                 Based on pGT5 from P. abyssi; used with ΔpyrE strains                                   67

ᵃAdh, alcohol resistance; Amp, ampicillin; Ani, anisomycin; Hdr, trimethoprim; Hyg, hygromycin B; Leu, leucine; Mev, mevinolin;
Nov, novobiocin; PA, pseudomonic acid; Pro, proline; Pur, puromycin; Pyr, 5-fluoroorotic acid; Thi, thiostrepton; Trp, tryptophan.

encoding resistance to the 3-hydroxy-3-methylglutaryl
coenzyme A reductase inhibitor mevinolin,
which is a cholesterol-reducing drug for humans that 
inhibits synthesis of isoprenoid lipid side chains in ar- 
chaea (62). A promoter point mutation leads to over- 
production of the reductase, which is the target of 
mevinolin. The Haloferax volcanii MevR determinant
has been shown to confer mevinolin resistance in 
Haloferax spp., Haloarcula spp., and Halobacterium 
salinarium (formerly Halobacterium halobium) (18, 
42, 59, 80, 115). Another haloarchaeal genetic locus 
isolated from spontaneous gyrase mutants of H. vol- 
canii (48) provides resistance to the gyrase inhibitor 
novobiocin. Several hybrid vectors have been con- 
structed that use both markers for transformant 
selection and insertional inactivation (Table 3). Resistance
to the broad-spectrum protein synthesis inhibitors anisomycin,
sparsomycin, and thiostrepton has been
reported in H. salinarium using a plasmid-borne, altered
23S rRNA gene (69, 105), and deletion of dihydrofolate
reductase (hdrA) in H. volcanii confers
resistance to trimethoprim (82). However, these latter
resistance markers have not been exploited for halo- 
archaeal selection systems. Counter-selectable gene 
knockout systems have been developed with sponta- 
neous 5-fluoroorotic acid-resistant mutants of Halo- 
bacterium spp. and H. volcanii using ura3 and pyrE, 
respectively (4, 17, 85). Auxotrophic mutants have 
also been developed that can be complemented with 
genes for histidine, leucine, thymidine, and
tryptophan biosynthesis as selectable markers (4, 28, 82). 

Mesophilic methanogenic archaea are sensitive 
to several antibiotics that inhibit protein synthesis, in- 
cluding puromycin, neomycin, and pseudomonic acid 
(20, 88). The most commonly used antibiotic for ge- 
netic selection is puromycin. In this system the struc- 
tural gene from the bacterium Streptomyces alboniger 
that encodes puromycin transacetylase (pac) has been 
used as a selective marker for Methanococcus voltae, 
Methanococcus maripaludis, and Methanosarcina 
spp. As in all archaea, the bacterial S. alboniger
promoter is not recognized by the transcriptional apparatus; therefore,
the pac gene is flanked with the constitutively
expressed methyl coenzyme M reductase (MCR) 
promoter and terminator from M. voltae (plasmid 
pMEB.2) or Methanosarcina barkeri (plasmid pJK3) 
(37, 73). The pac transcription cassette is flanked by 
either EcoRI sites or multiple restriction sites in plas- 
mids pMEB.2 and pJK3, respectively, for excision and 
insertion into other vectors. Another antibiotic resis- 
tance marker for the methanogens utilizes aminogly- 
coside phosphotransferase genes APH3’I and APH3’II 
from bacterial transposons Tn903 and Tn5, respec- 
tively, engineered with flanking archaeal mcr promoter 
and terminator sequences (9). When introduced into 


Methanococcus maripaludis on plasmids, these genes 
confer resistance to the protein synthesis inhibitor 
neomycin at frequencies equivalent to those achieved 
for puromycin resistance with pac. However, these 
markers are ineffective for Methanosarcina spp. be- 
cause these species are not inhibited by neomycin 
(19). Spontaneous mutations in the gene encoding 
isoleucyl-tRNA synthetase, ileS, confer resistance to
pseudomonic acid in Methanothermobacter thermautotrophicus
(53). Pseudomonic acid was developed
also as a selectable marker in M. barkeri by directed 
mutagenesis of the methanosarcinal ileS (19). Al- 
though this is a tractable genetic system, pseudo- 
monic acid, also known as mupirocin, is the active in- 
gredient in a topical antimicrobial ointment and is 
currently available in pure form only by request from 
the manufacturer (GlaxoSmithKline). Bacitracin, an 
inhibitor of bacterial murein synthesis, has been re- 
ported to inhibit M. thermautotrophicus, presumably 
by inhibiting synthesis of pseudomurein, but a 
tractable gene transfer system has not been developed 
for this genus (44). In addition to antibiotics, aux- 
otrophic mutants have been developed with defects in 
the biosynthesis of histidine, leucine, and proline (45, 
86, 89). A shuttle vector encoding Escherichia coli 
proC is available as a selectable marker for a Methanosarcina
acetivorans ΔproC host (89). Disruption
in the purine salvage pathway gene hpt, which en- 
codes hypoxanthine phosphoribosyltransferase, has 
been used for counter selection with either 8-aza-2,6- 
diaminopurine or 8-aza-hypoxanthine in M. acetivo- 
rans and M. maripaludis (76, 89). 

For both methanogenic and nonmethanogenic 
hyperthermophiles, thermostability of both the marker 
gene product and its target has largely precluded the 
use of mesophilic bacterial and archaeal markers. 
However, several high-temperature-resistant markers 
have been described for use in these microorganisms. 
Sulfolobus solfataricus is sensitive to the thermostable 
antibiotic hygromycin with a low spontaneous rever- 
sion frequency. A plasmid that contains a gene for a 
thermostable mutant form of hygromycin phosphotransferase
(hph) isolated from E. coli was constructed
with flanking S. solfataricus aspartate aminotrans- 
ferase promoter and terminator sequences for ex- 
pression in thermophilic archaea (22, 23). Another 
selectable genetic marker is based on butanol 
sensitivity of hyperthermophilic species that lack al- 
cohol dehydrogenase (ADH) (2). The ADH structural 
gene from alcohol-tolerant S. solfataricus, together 
with its promoter and terminator sequences, confers 
resistance when transformed into Sulfolobus acido- 
caldarius and Pyrococcus furiosus (8, 29). Two other 
selectable markers have been developed based on the 
23S rRNA gene of S. acidocaldarius and P. furiosus,
where a point mutation in one position confers resistance
to carbomycin, chloramphenicol, and celesticetin,
and a point mutation at a second site confers
resistance to thiostrepton (2). However, the rate of
spontaneous reversion (>10⁻⁷) at either location is
too high for use as a selection marker and efforts are 
under way to use a 23S rRNA gene possessing both 
point mutations to reduce the probability of sponta- 
neous reversion of P. furiosus grown in the presence 
of both antibiotics. Spontaneous novobiocin-resistant 
mutants of S. acidocaldarius have also been reported 
but must be maintained with novobiocin because of 
high reversion rates (40). Counter-selectable gene 
knockouts have been reported for 5-fluoroorotic acid- 
resistant mutants of Sulfolobus spp., Pyrococcus 
abyssi, and Thermococcus kodakaraensis using pyrE 
and pyrF (40, 55, 67, 70, 96, 112). S. solfataricus 
auxotrophic mutants for the β-linked disaccharide
lactose (lacS) (117) and the α-linked oligosaccharides
starch and glycogen (malA) also have the potential
to be used for targeted mutagenesis. An auxotrophic
mutant for tryptophan, trpE, in T. kodakaraensis has
been reported, but it has not been tested with a selectable
marker (96).
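
The rationale for combining both 23S rRNA point mutations, described earlier in this section, can be illustrated with a short calculation. The Python sketch below assumes that spontaneous changes at the two sites arise independently, so that their frequencies multiply; this independence, the function name, and the number of cells plated are illustrative assumptions and not values from the cited work. Only the per-site frequency of about 10⁻⁷ comes from the text.

```python
def expected_background(cells_plated: float, *site_frequencies: float) -> float:
    """Expected number of spontaneous background colonies when a change
    is required at every listed site (sites assumed independent)."""
    joint_frequency = 1.0
    for frequency in site_frequencies:
        joint_frequency *= frequency
    return cells_plated * joint_frequency


# Hypothetical plating of 1e9 cells; 1e-7 is the per-site rate from the text.
single_site = expected_background(1e9, 1e-7)        # one site, one antibiotic
double_site = expected_background(1e9, 1e-7, 1e-7)  # both sites, both antibiotics
print(f"single site: ~{single_site:.0f} background colonies")
print(f"both sites:  ~{double_site:.0e} background colonies")
```

Under these assumptions, requiring both changes lowers the expected background by roughly seven orders of magnitude, which is why selection with both antibiotics is attractive.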


GENE TRANSFER SYSTEMS 
Transformation 


Haloarchaea are transformed via polyethylene 
glycol (PEG)-mediated transformation, which was
first described in H. volcanii (24). This method re- 
quires the formation of spheroplasts generated by re- 
moving the surface glycoprotein layer (S-layer). The 
primary complication encountered with transforma- 
tion in halophilic species is that most halophiles have 
restriction systems that lower the efficiency of the 
transformation (25). However, transformation effi- 
ciencies have been improved significantly by using 
restriction-minus mutants and passing the DNA 
through a Dam⁻ strain of E. coli (50).

Within the methanogenic archaea, there are only 
two genera for which transformation protocols have 
been developed: Methanococcus and Methanosar- 
cina. M. voltae spheroplasts can be transformed using 
electroporation and can also take up circular and 
linear DNA without any further manipulation (83). 
However, the transformation efficiency for these 
methods is low, at around 10! to 10% transformants 
µg⁻¹ DNA (83). PEG-mediated transformation is a
very efficient method for use with M. maripaludis and 
is capable of producing 10⁷ transformants µg⁻¹ DNA
(109). In contrast PEG-mediated transformation is 
ineffective for the Methanosarcina species, but liposome-mediated
transformation has proven an efficient
method for transformation (31, 73). The efficiency
of this method is similar to that of PEG-mediated
transformation of M. maripaludis, producing >10⁷
transformants µg⁻¹ DNA for Methanosarcina species.
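
Transformation efficiencies such as those quoted above are expressed as transformants per microgram of DNA. The following Python sketch shows one common way to compute that figure from a colony count; the function name, the dilution handling, and the example numbers are assumptions for illustration and are not drawn from the cited protocols.

```python
def transformants_per_ug(colonies: int, dna_ug: float,
                         fraction_plated: float = 1.0) -> float:
    """Transformants per microgram of DNA, corrected for the fraction
    of the transformation mixture that was actually plated."""
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    return colonies / (dna_ug * fraction_plated)


# Hypothetical example: 250 colonies after plating 1/1000 of a
# transformation that contained 0.05 µg of plasmid DNA.
efficiency = transformants_per_ug(250, 0.05, fraction_plated=1e-3)
print(f"{efficiency:.1e} transformants per µg DNA")
```

Correcting for the plated fraction matters when comparing methods, because high-efficiency procedures usually require dilution before plating to obtain countable colonies.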

There are several effective methods for transfor- 
mation of the thermophilic archaea. Similar to M. 
voltae, T. kodakaraensis is capable of taking up DNA 
without any modification or mediation (96). PEG- 
mediated transformation with spheroplasts has been 
shown to be effective for P. abyssi (67). S. solfataricus 
and P. furiosus have both been successfully transformed
with circular DNA using CaCl₂ treatment followed
by heat shock (2). S. acidocaldarius and S. solfataricus
can be transformed with linear and circular
DNA using electroporation (1, 8, 22, 32, 98). 


Transduction 


Phages and viruses have been discovered within 
all three groups of archaea, but transduction cur- 
rently is not a tractable method for gene transfer. 
Methanothermobacter marburgensis (formerly Meth- 
anobacterium thermoautotrophicum Marburg) has 
been directly transduced with ΨM1, a phage isolated
from this organism (71). However, this phage has a 
limited burst size, which makes it impractical for use 
in gene transfer experiments. 


Conjugation 


Gene transfer by conjugation has been docu- 
mented within the halophiles and the hyperther- 
mophiles. In H. volcanii, direct cell-cell contact and 
fusion with the transfer of genetic material has been 
shown (92). The transfer efficiency is stimulated by 
treatment with DNase (75). The DNA transfer is bidi- 
rectional and does not appear to require any plasmid 
or transposon-encoded genes. Cell mating has also 
been reported for two members of the hyperther- 
mophilic archaea, S. acidocaldarius and S. solfatari- 
cus. With S. acidocaldarius, genetic marker exchange 
has been reported between two auxotrophic mutants 
(39). S. solfataricus uses plasmid-mediated conjuga- 
tion requiring direct cell-cell contact in which the 
conjugative plasmid pNOB8 is transferred unidirectionally
(97, 99).


GENE VECTORS 


Hybrid vectors contain both bacterial and ar- 
chaeal components that enable them to be replicated 
and selected in both systems (Fig. 1). A large number 
of shuttle vectors are available for the halophiles in


Figure 1. Recombinant plasmid showing construction typical for
an E. coli/archaeal shuttle vector. The construct includes the
pir-dependent R6K ori for replication of the plasmid in pir⁺ E. coli,
the bla gene for selection of E. coli transformants with ampicillin,
pC2A ori and repA for replication in Methanosarcina spp., and pac
under transcriptional control of the archaeal mcrB gene for selection
of methanosarcinal recombinants on puromycin. pC2A, autonomously
replicating plasmid pC2A from M. acetivorans; pac,
puromycin N-acetyltransferase flanked by the methylCoM reductase
archaeal promoter (pmcr) and terminator (tmcr); oriR6K,
pir-dependent R6K origin of replication; bla, β-lactamase; mcs, multiple
cloning site in the lacZ gene encoding β-galactosidase for
blue-white screening of DNA insertion.


comparison to the methanogens or the thermophiles 
(Table 3). H. volcanii contains a naturally occurring 
miniplasmid, pHV2. Using the origin from this vec- 
tor, shuttle vectors for halophilic archaea were devel- 
oped containing the gene encoding mevinolin resis- 
tance for selection within the halophile and a separate 
resistance gene for selection within E. coli (62). The 
shuttle vectors derived from pHV2 can be used in 
H. volcanii as well as Halobacterium spp. Other shut- 
tle vectors for use in the halophiles have also been de- 
veloped from naturally occurring plasmids, such as 
pNRC100, pHH1, and pGRB1 (18, 59, 78, 79). 
Within the methanogens, shuttle vectors for 
Methanosarcina spp. have been constructed from the 
naturally occurring plasmid pC2A from M. acetivo- 
rans C2A (73, 102). These vectors contain the origin 
of replication from pC2A, a ColE1 origin for repli- 
cation in E. coli, the pac gene for puromycin resis- 
tance within the methanogen, and a gene encoding 
ampicillin resistance for selection in E. coli (Fig. 1). 
The pWM family of plasmids, which are all deriva- 
tives of pC2A, are capable of transformation into all 
Methanosarcina species (31, 73). Shuttle vectors for 
M. maripaludis have been constructed using the cryp- 


tic plasmid pURB500 although, unlike derivatives of 
methanosarcinal pC2A, these vectors can only be 
used with M. maripaludis (107). 

There is a relative dearth of shuttle vectors avail- 
able for use within the hyperthermophilic archaea. 
One type of shuttle vector currently used is based on 
pGT5, an autonomously replicating plasmid from
P. abyssi GE5 (33). These plasmids can be used in
both P. abyssi and S. acidocaldarius (2, 8, 67). Another
shuttle vector uses a replication origin from
Sulfolobus phage SSV1 (22). This vector, pEXSs, contains
the hph gene for selection of transformants of
S. solfataricus with hygromycin.


GENE DELETION SYSTEMS 


Random Mutagenesis 


Gene disruption is required to identify and confirm
the function of genes within the archaea. Random
mutagenesis using chemical mutagens and UV radiation
has been successfully used for H. volcanii, M. voltae,
M. maripaludis, and P. abyssi (16, 61, 75, 113) (also
see Chapter 5). However, a significant problem with
these approaches is locating the resulting mutations. A
transposition system has been developed for M. ace- 
tivorans using a modified mariner-family Himar1 
transposon that utilizes only the cognate transposase 
expressed from methanosarcinal promoters (118). 
The transposon elements flank both archaeal and 
bacterial selection markers and an E. coli origin of 
replication. After transformation of the vector into 
the archaeon, transposed genomic DNA is restriction 
digested, ligated, and propagated as a plasmid by 
transformation into E. coli. The mutated gene is read- 
ily identified by sequencing the genomic DNA flank- 
ing the transposon elements. By use of this approach, 
mariner-induced M. acetivorans mutants were iden- 
tified for genes that encode proteins for heat shock, 
nitrogen fixation, and cell wall structures (118). 
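
The final step described above, mapping the sequenced flank back onto the genome, can be illustrated with a toy example. The following Python sketch is hypothetical: the function name, the toy genome, and the flank read are invented for illustration, and it assumes that the read ends exactly at the transposon junction and matches the genome without sequencing errors. Mariner-family elements insert at TA dinucleotides, so the toy read is chosen to end in TA.

```python
def locate_insertion(genome, flank):
    """Return the 0-based coordinate immediately after a uniquely matching flank.

    Returns None when the flank is absent or occurs more than once,
    because the insertion site cannot then be assigned unambiguously.
    """
    first = genome.find(flank)
    if first == -1:
        return None                      # flank not found in the genome
    if genome.find(flank, first + 1) != -1:
        return None                      # flank is repeated: ambiguous
    return first + len(flank)


# Toy data (hypothetical, not a real M. acetivorans sequence).
toy_genome = "GGCATCGATTACGGATCCATGCAAGT"
flank_read = "CATCGATTA"                 # read ending at the transposon junction

site = locate_insertion(toy_genome, flank_read)
print("insertion after position", site)
```

In practice the flank would be aligned against the complete genome sequence with a dedicated alignment tool; the exact-match search here only shows why a unique flank is sufficient to identify the disrupted gene.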


Targeted Gene Knockout Systems 


Gene knockout systems that disrupt specific loci 
have been developed for all three groups of archaea. 
These systems typically include disruption with reten- 
tion of a genetic marker and “markerless systems” by 
which the genetic marker is removed after disruption 
by a second homologous recombination event, en- 
abling reuse of the markers for subsequent deletions 
(Fig. 2). Within the halophilic archaea, a commonly 
used method is based on restoration of uracil pro- 
totrophy in a ura mutant or pyrE2 strain as a genetic 
marker (4, 17, 85, 111). Circular nonreplicating

Figure 2. Gene disruption methods that are used in archaeal genetics. (a) Direct replacement of a gene with a selectable marker occurs by recombination between
linear DNA flanked by regions of the target gene and the wild-type chromosomal gene. (b) The “pop-in pop-out” method uses circular DNA and selection for
transformation to uracil prototrophy using a ura- strain (17, 85, 96). Recombinants that have lost the plasmid are counter-selected using 5-fluoroorotic acid
(5-FOA), which inhibits growth of ura+ cells. Deletion mutants must be screened by Southern hybridization. (c) A variant of the “pop-in pop-out” method for
gene deletion utilizes a genetic marker for gene disruption that allows direct selection (4). (d) Another variant used for generating point mutations employs
gene replacement with a ura marker and subsequent replacement of the ura disrupted target gene with a gene containing the desired point mutation (85). Mutants
with the desired point mutation are counter-selected with 5-FOA. Reprinted from Nature Reviews Genetics (3) with permission of the publisher.

plasmids that contain the ura3 or pyrE2 genes and the
disrupted target gene are integrated into the target gene
loci by homologous recombination and selected in 
uracil-free medium. Counter-selection with 5-fluoroorotic
acid selects for clones that have lost the ura3
or pyrE2 gene, together with either the wild-type or the
mutated copy of the target gene, through a second
homologous recombination event. Recombinant strains must
then be screened by Southern analysis or PCR to identify
clones with single copies of the disrupted gene. A
modification of this method utilizes the gene encoding
mevinolin resistance (mevR) to disrupt the target gene,
which allows direct selection of the disrupted clone by
adding mevinolin to the medium (4). Another variation
employs insertion of ura3 by homologous recombination into
the genomic target gene followed by substitution of
the genomic target gene with a copy of the gene with a
point mutation on a plasmid (85). The point mutants
are counter-selected with 5-fluoroorotic acid. H. volcanii
ΔleuB and ΔtrpA mutants are also available for
positive selection of gene deletions (4).
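
The selection logic of the pop-in/pop-out scheme can be sketched as a small state model. The Python code below is illustrative only: the class and function names are hypothetical, and it assumes an idealized ura- host in which uracil prototrophy depends entirely on the plasmid-borne marker and 5-fluoroorotic acid kills every cell that carries that marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Idealized genotype at the target locus of a ura- host strain."""
    has_ura_marker: bool      # is the plasmid-borne ura3/pyrE2 marker integrated?
    target_alleles: tuple     # e.g. ("wt",), ("wt", "mut") or ("mut",)


def grows_without_uracil(cell):
    # Uracil prototrophy requires the integrated marker.
    return cell.has_ura_marker


def grows_on_5foa(cell):
    # 5-FOA counter-selects against cells with a functional ura marker.
    return not cell.has_ura_marker


def pop_in(cell):
    """First recombination: the whole plasmid integrates (merodiploid)."""
    return Cell(True, cell.target_alleles + ("mut",))


def pop_out(cell):
    """Second recombination: the plasmid is lost, leaving one allele or the other."""
    return [Cell(False, (allele,)) for allele in cell.target_alleles]


host = Cell(False, ("wt",))
merodiploid = pop_in(host)
survivors = [c for c in pop_out(merodiploid) if grows_on_5foa(c)]
print([c.target_alleles for c in survivors])
```

Both the wild-type revertant and the deletion mutant survive counter-selection in this model, which is why the surviving clones still have to be screened by Southern analysis or PCR.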

Within the methanogenic archaea, the gene 
knockout systems that were developed for Methano- 
coccus spp. and Methanosarcina spp. utilize selective 
markers for puromycin, pseudomonic acid, and neo- 
mycin resistance (9, 19, 37). The simplest means of 
gene disruption is by homologous recombination 
with the genetic marker distally flanked by sequences 
from the target gene on a nonreplicating shuttle vec- 
tor or linear DNA fragment. Mutants are then se- 
lected by growing the transformant with the corre- 
sponding antibiotic. This approach has been used in 
Methanococcus spp. by disruption with neomycin 
(APH3'I/II) and puromycin (pac) resistance markers
(12, 46, 63, 64, 66, 87, 106, 116) and in Methano- 
sarcina spp. with pac and pseudomonic acid (ileS) re- 
sistance markers using homologous recombination 
(41, 68, 114) and transposon insertion (119). Di- 
rected gene disruption is also possible by comple- 
menting histidine auxotrophy in a hisA mutant strain 
of M. voltae and proline auxotrophy in a proC mu- 
tant strain of M. acetivorans (15, 119). 

A disadvantage of gene disruption with an an- 
tibiotic or by complementation of an auxotrophic 
mutant is that the genetic marker cannot be reused 
for multiple disruptions. As an alternative, a marker- 
less gene disruption system for the methanogens has 
been developed using a mutant that is defective for
hypoxanthine phosphoribosyltransferase (hpt), an enzyme
involved in the purine salvage pathway, and is therefore
resistant to inhibition by the purine base analog
8-aza-2,6-diamino-purine (8-ADP) (21, 89, 76). In the first
step of disruption, wild-type hpt and pac flanked by
target DNA are inserted into the wild-type gene by
homologous recombination to create an unstable
merodiploid that is selected for puromycin resistance
and screened for sensitivity to 8-ADP. In the second
step, selected transformants are grown under nonselective
conditions without puromycin to promote
plasmid excision, which creates a mixed population
of clones with either the wild-type gene or a disrupted
gene. However, clones with the mutant allele must then be
identified among the puromycin-sensitive clones either
by Southern hybridization or PCR.

A modified markerless disruption method has re- 
cently been devised that eliminates the requirement 
for screening (Fig. 3) (94). This method first involves 
disruption of the target gene by a double-recombina- 
tion event with a linear DNA fragment containing the 
pac-hpt cassette, flanked by flp recombinase recogni- 
tion sites and sequence from the target gene. Insertion 
of this construct yields a gene disruption that is se- 
lected by puromycin resistance and 8-ADP sensitivity. 
The mutant is then transformed with a nonreplicating 
plasmid that encodes the gene for flp recombinase 
under archaeal promoter control, which, when ex- 
pressed, will delete the region between the flp recog- 
nition sites, generating a deletion mutant that is 
puromycin sensitive and 8-ADP resistant. As an alter- 
native to gene deletions, conditional gene inactivation 
has also been used to test for essential genes (93). In 
this approach the target gene is fused to a highly reg- 
ulated promoter, and the effects of this fusion are as- 
sayed under expressing and nonexpressing growth 
conditions. This approach has also been proposed for 
identification of trans-acting regulatory factors that 
control multiple genes in a regulatory system (94). 
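
The phenotypes that are selected at each stage of the Flp-based markerless method can be tabulated with a short sketch. The Python code below is a simplified illustration under stated assumptions: the function and stage names are hypothetical, pac is taken to be the only source of puromycin resistance, and a functional hpt gene is taken to be the only determinant of 8-ADP sensitivity.

```python
def phenotype(has_pac, has_hpt):
    """Return the drug phenotypes of an idealized methanogen cell."""
    return {
        "puromycin": "resistant" if has_pac else "sensitive",
        # Cells with a functional hpt gene are inhibited by 8-ADP.
        "8-ADP": "sensitive" if has_hpt else "resistant",
    }


# (stage, carries pac, carries hpt)
STAGES = [
    ("hpt-deficient host strain", False, False),
    ("after insertion of the pac-hpt cassette", True, True),
    ("after Flp-mediated excision of the cassette", False, False),
]


def describe():
    return [(name, phenotype(pac, hpt)) for name, pac, hpt in STAGES]


for name, p in describe():
    print(f"{name}: puromycin {p['puromycin']}, 8-ADP {p['8-ADP']}")
```

Because the final deletion mutant returns to the drug phenotype of the host strain, the same pac-hpt cassette can in principle be reused for a further deletion.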

Gene disruption protocols exist for two species of 
thermophilic archaea. In a lacS mutant of S. solfataricus
created through transposon-mediated mutagenesis, target
genes are disrupted by homologous recombination with a
lacS gene distally flanked by sequences of the target
gene. T. kodakaraensis genes can also be disrupted via
recombination, but this method has been modified to allow for
the reuse of the selective marker (95, 96). Using either
pyrF or trpE as selective markers with endogenous
and exogenous sequences that flank the selective
marker gene and act as tandem repeats, the genetic
marker is excised via recombination under nonselec- 
tive growth conditions. Subsequent deletions can be 
made in the mutant strain using the same marker. 


GENE REPORTERS 


Several phenotypic markers are available for the 
archaea that have been exploited largely for in vivo 
gene expression studies rather than genetic markers. 
A salt-tolerant β-galactosidase from Haloferax alicantei
has been expressed in Halobacterium spp. and

Figure 3. Markerless disruption method that employs Flp recombinase. An artificial operon that expresses puromycin N-acetyltransferase
(pac) and hypoxanthine phosphoribosyltransferase (hpt) is flanked by Flp recombinase recognition sites (RP1 and
RP2) and regions homologous to the target gene. The linear DNA is transformed into an M. acetivorans Δhpt strain that is
resistant to 8-aza-2,6-diamino-purine (8-ADP). The target gene is replaced by homologous recombination, and recombinants
are selected by resistance to puromycin. The deletion mutant is subsequently transformed with the nonreplicating plasmid
pMR55 encoding Flp recombinase, which removes the pac-hpt operon by site-specific recombination between RP1 and RP2.
Reprinted from Current Opinion in Microbiology (94) with permission of the publisher.


H. volcanii, which lack detectable β-galactosidase
background activity (38, 49, 51, 84). A modified derivative
of green fluorescent protein has also been expressed
in H. volcanii as a gene reporter (90). A phenotype
with potential for use in genetic studies is
the purple color associated with bacteriorhodopsin in
H. salinarium, which is encoded by the bop gene.
When present on a plasmid, the bop gene is capable of
complementing a bop insertion
mutant, which is detected by purple-colored colony
formation (59). Constructs containing functional bop
genes are potentially useful not only for studies employing
bop mutant strains but also when propagated
in naturally occurring bop-less halophiles (e.g., H.
volcanii), provided that the organism is capable of
synthesizing the retinal chromophore required for
bacteriorhodopsin function or that retinal is added to
the medium.

The E. coli uidA (encodes β-glucuronidase) and
lacZ (encodes β-galactosidase) genes and Bacillus
subtilis treA (encodes trehalase) have been expressed
as phenotypic markers in the methanogenic archaea,
Methanococcus spp. and Methanosarcina spp. (7, 15,
27, 89, 100). When flanked by archaeal promoter and 
terminator sequences, these three markers can be 
used to quantify archaeal gene expression. Since the 
chromophores are colorless under anaerobic condi- 
tions, the assays are conducted aerobically, or filter 
blots of the transformed colonies must be exposed to 
air to develop the color. 

One phenotypic marker, the gene encoding a thermostable
β-galactosidase (lacS) from S. solfataricus, has been shown to
complement mutants in transformed cells (32). However,
plasmid pNOB8 containing the lacS gene was
eventually lost in transformants with restored
β-galactosidase activity because of inadvertent integration
of lacS into the genome. A plasmid-mediated lacS
reporter has also been developed for directed integration
into the S. solfataricus genome based on counterselection
with pyrEF (55). Two other genes for thermostable
β-galactosidases, isolated and cloned from P. furiosus,
whose products exhibit maximum activity at 95°C could
potentially function as phenotypic markers for
hyperthermophilic species (2).


PERSPECTIVE: THE NEXT FIVE YEARS 


Progress in the development of methodologies 
for archaeal genetics has rapidly accelerated in the 
past decade. Additional methods are currently un- 
der development for all three archaeal phyla, in- 
cluding additional systems for markerless exchange, 
gene expression, topological mapping, protein tag- 
ging and expression, as well as others. The intense 
interest in developing these methods has in large 
part been promoted by the availability of archaeal 
genome sequences. As the earliest archaeal genomes 
became available in the mid-1990s it was recog- 
nized that the lack of tractable gene transfer systems 
for many of these species limited the ability to con- 
firm the function of annotated genes and determine 
the function of genes encoding unidentified pro- 
teins. The choice of archaeal genomes for sequenc- 
ing is now largely driven by the availability of ge- 
netic systems, which at present include complete 
genomes of the halophiles Haloarcula marismortui 
and Halobacterium sp. NRC-1, the methanogens 
M. maripaludis, M. acetivorans, M. barkeri, and 
M. mazei, and the hyperthermophiles S. solfataricus 
P2 and T. kodakaraensis KOD1. The development of 
genetic tools for the Archaea is ongoing, and more so- 
phisticated techniques are currently under develop- 
ment that will enable researchers to use genetic analy- 
sis on a routine basis to understand the unique abilities 


of these microorganisms to proliferate in extreme 
environments. 


INTRODUCTION


The exploitation of natural catalysts as whole 
cells or as purified enzymes is the essence of biotech- 
nology and was developed in an effort to obtain en- 
vironmentally friendly and economically convenient 
industrial processes. Nevertheless, several industrial 
processes are damaging for most “conventional” or- 
ganisms and biocatalysts. This may explain why, since 
their discovery in the 1950s, organisms living under 
extreme environmental conditions and the molecules 
extracted from them appeared to provide very 
promising tools for biotechnology. These expecta- 
tions were motivated by the unusual physical condi- 
tions required by extremophiles for growth. 

Extremophiles are mainly bacterial and archaeal 
microorganisms, with the majority from the Archaea, 
although they exist within the three domains of life 
since a eucaryal thermophile has also been reported 
(18). This biodiversity is a key feature that enables the 
isolation of organisms with peculiar or unexpected 
metabolic pathways and enzymes with different char- 
acteristics in terms of substrate specificity, selectivity, 
and reaction mechanism. As a consequence, biotech- 
nologists can screen extremophiles for the most suit- 
able biocatalyst to apply in a particular industrial 
field. The exploitation of the natural diversity of ex- 
tremophiles is one of the missions of the Diversa Cor- 
poration (USA), which is committed to the rapid dis- 
covery of novel microbes isolated from the most 
diverse biotopes (www.diversa.com/). 

Despite these premises and the extensive research, 
the number of extant applications of extremophiles is 
still limited. The reason for this stalling is that their 
commercial use is based not only on their peculiar 
properties but also on several other different criteria, 
including technology integration, intellectual property, 
regulatory compliance, and assurance of supply, that 


must be satisfied. In other words, microorganisms or 
biomolecules can be successfully exploited in novel in- 
dustrial processes if they introduce an innovation that 
outcompetes existing products; this commercial reality 
also applies to the exploitation of extremophiles. 

The biotechnology of the Archaea has been re- 
viewed in recent excellent articles (35, 117). The top- 
ics covered in this chapter include enzymes and mol- 
ecules from archaea and the use of archaeal whole 
cells. Other topics relevant to the biotechnology of ar- 
chaea, such as functional genomics, the development 
of molecular biology tools, and archaeosomes are 
covered in Chapters 20, 21 and 23, respectively. Most 
of the applications described here are common to ex- 
tremophilic bacteria and archaea and have been de- 
scribed in many excellent reviews (9, 20, 45, 50, 123, 
144, 145). The chapter updates the current state of 
biotechnology of the Archaea, paying special atten- 
tion to distinguish between extant and potential ap- 
plications, to provide a realistic overview of the im- 
pact of these organisms on biotechnology. 


ENZYMES FROM ARCHAEA 


The estimated value of the worldwide use of industrial
enzymes grew from $1 billion in 1995 to $2 billion
in 2004 and, depending on different estimates, is
expected to rise to $2.4 to $5.1 billion in 2009 (Business
Communication Company, www.bccresearch.com;
Freedonia group, www.freedoniagroup.com/world
-html). Among industrial enzymes, the mesophilic
ones are often not well suited for the harsh conditions
adopted in industry because they are rapidly denatured
at the operational conditions; therefore, extremophilic
industrial enzymes (extremozymes) provide
interesting alternatives for working at high temperatures
or in the presence of high ionic strength and organic
solvents (28, 71). Enzymes from psychrophiles,
thermophiles, halophiles, and piezophiles usually
show the same characteristics as the organisms from
which they have been extracted. With the exception
of enzymes secreted in the medium, this is not true for 
acidophiles and alkaliphiles, whose enzymes are op- 
timally active at the neutral intracellular pH. Never- 
theless, the extreme acidophile Picrophilus torridus 
maintains an intracellular pH of 4.6, and therefore 
the enzymes from this organism hold particular in- 
terest for biotechnology (43). 
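
The market figures quoted at the start of this section can be restated as compound annual growth rates, which makes the spread between the two 2009 forecasts easier to compare. The Python sketch below is a simple arithmetic illustration; the function name is hypothetical, and it assumes smooth year-on-year growth between the quoted values.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Values in billions of US dollars, as quoted in the text.
growth_1995_2004 = cagr(1.0, 2.0, 2004 - 1995)
forecast_low = cagr(2.0, 2.4, 2009 - 2004)
forecast_high = cagr(2.0, 5.1, 2009 - 2004)

print(f"1995-2004: {growth_1995_2004:.1%} per year")
print(f"2004-2009 forecasts: {forecast_low:.1%} to {forecast_high:.1%} per year")
```

On these assumptions the historical rate is about 8% per year, whereas the two forecasts correspond to roughly 4% and 21% per year.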

The interest in the application of extremozymes 
resides in their surprising properties. There has been ex- 
tensive research on the structure/function relationship 
in extremozymes, with the aim to uncover the molecu- 
lar mechanisms of stabilization of these biocatalysts, 
and to engineer stabilized mutants of conventional en- 
zymes (reviewed in references 56, 145). Enzymes from 
cold-adapted microorganisms offer considerable potential
for biotechnological applications (reviewed in references
20, 35, 45). However, very few psychrophilic
archaea have been isolated (20, 35), and they will 
therefore not be covered extensively in this chapter. 

Among the extremozymes, those from halophiles 
and thermophiles (thermozymes) are, by far, the most 
studied, and their application in biotechnology has 
been exploited for a long time. Enzymes from halophilic
bacteria work optimally at high salt concentrations,
at which the water activity is greatly reduced,
which makes them interesting biocatalysts in aqueous/organic
and nonaqueous media (1). The general advantage
of thermozymes in industrial applications is their
outstanding stability to high temperature, proteases, 
and high concentration of organics. These catalysts 
have allowed the development of new processes (e.g., 
the polymerase chain reaction) and have been adopted 
as alternatives to enzymes from mesophiles since their 
prolonged life span reduces the cost of the continuous 
addition of fresh biocatalyst. Additional advantages 
of conducting biocatalysis and biotransformation 
processes at high temperatures include the reduced 
risk of contamination by common mesophilic organ- 
isms, an increased solubility and diffusion coefficient 
of organic compounds, and a reduced viscosity of the 
medium. These features lead to increased reaction rates 
and yield by favoring the extraction of volatile or- 
ganic compounds, the accessibility of recalcitrant sub- 
strates to hydrolysis, and the equilibrium displacement 
in endothermic reactions (74, 88). In addition, their pe- 
culiar characteristics allow easier purification proce- 
dures for the recombinant enzymes (i.e., simple heat- 
ing steps eliminating the unstable proteins of the host), 
resulting in more convenient downstream processing. 

The following sections overview the most promis- 
ing applications of extremozymes, emphasizing the 


archaeal examples. A short survey of examples of en- 
zymes from extremophiles is shown in Table 1. The 
table is not exhaustive as the number of extremo- 
zymes is large. The reader is directed to the papers 
cited in this chapter and a number of excellent re- 
views on these biocatalysts (9, 28, 50, 56, 81, 123, 
143-145), some of which deal specifically with ar- 
chaeal extremozymes (35, 117). 


Hydrolases 
Protease 


Among hydrolases, proteases, esterases/lipases, 
and glycoside hydrolases are important in industry. In 
particular, proteases are the major industrial enzymes 
covering more than 25% of the world market (Busi- 
ness Communication Company, www.bccresearch 
.com) and are extensively used in food, pharmaceuti- 
cal, leather, and textile industries. (Hyper)thermo- 
philic proteases are important components of the de- 
tergent formulations as they improve the cleaning 
ability of the detergent on protein-based stains (37).
The stability of the thermophilic proteases to the 
laundry components (in particular, detergents and al- 
kaline pH) is the main reason for this interest (71). 
Several archaea harbor numerous intracellular
and extracellular proteases, such as Pyrococcus
furiosus, which has more than a dozen
such enzymes (26), Sulfolobus solfataricus (49), several
Thermococcales (42), for which a patent has also
been filed in the United States (128), and Aeropyrum
pernix (19). A detailed account of archaeal proteases
has been published (35), including several articles in
Methods in Enzymology, vol. 330, 2001. More re- 
cently, proteases working at low temperature in house- 
hold laundry detergents have been used to maintain 
the colors of the clothes and to save energy. There- 
fore, archaeal psychrophilic proteases may also be 
exploited in the near future in laundry detergents. 

Halophilic proteases, such as those from Halo- 
ferax mediterranei (65, 127) and Haloarcula maris- 
mortui (40), are of considerable interest for their abil- 
ity to have greater activity in solvents of increased 
polarity (reviewed in references 81, 123). Under these 
conditions the hydrolytic activity is reduced, allowing 
the exploitation of proteases from halophiles in peptide
synthesis (69, 111). An extracellular protease
from Halobacterium halobium synthesized glycine-containing
peptides with yields of about 76% (111).
The activity of the enzyme was maximal in 32%
(vol/vol) dimethyl formamide while, under the same
conditions, subtilisin Carlsberg did not exhibit the
same level of increased activity.
Table 1. Applications of extremozymes

Enzyme | Type | Applications | Organism | Topt (°C) | pHopt | Remarks | References
Glycosyl hydrolases | α-Amylases | Starch hydrolysis, brewing, baking, detergents | Pyrococcus furiosus | 100 | 6.5-7.5 | Extracellular, recombinant | 32, 62
 |  |  | Pyrococcus furiosus | 100 | 7.0 | Extracellular, recombinant | 76
 | Pullulanases type II | Production of linear small sugars | Desulfurococcus mucosus | 85 | 5.0 | Recombinant | 33
 |  |  | Pyrococcus woesei | 100 | 6.0 | Recombinant | 119
 | Pullulanases type III |  | Thermococcus aggregans | 100 | 6.5 | Recombinant | 92
 | Glucoamylases | Production of glucose | Picrophilus oshimae | 90 | 2.0 | - | 124
 |  |  | Picrophilus torridus | 90 | 2.0 | - | 124
 |  |  | Thermoplasma acidophilum | 90 | 2.0 | - | 124
 | α-Glucosidases | Final step of starch degradation | Sulfolobus solfataricus | 120 | 4.5 | Recombinant | 108
 | Endo-β-glucanase | Cotton products biopolishing | Pyrococcus horikoshii | 85 | 5.6 | Recombinant | 5
 |  | Degradation of cellulose and production of cellooligomers | Sulfolobus solfataricus | 80 | 1.8 | Recombinant | 58
 | Xylanases | Paper bleaching | Sulfolobus solfataricus | 90 | 7.0 | Membrane associated | 17
 | α-Xylosidases |  | Sulfolobus solfataricus | 90 | 5.5 | Recombinant | 86
 | Chitinases | Food, cosmetics, pharmaceuticals, agrochemicals | Thermococcus chitonophagus | 80 | 6.0 | Extracellular; part of multicomponent enzymatic apparatus | 6
 | β-Galactosidases |  | Haloferax alicantei | 25 | 7.2 | Maximum activity in 4 M NaCl | 54
 | β-Glycosidases |  | Pyrococcus horikoshii | 90 | 6.5 | Membrane bound; recombinant | 3
 |  |  | Sulfolobus solfataricus | 85 | 5.4 | Recombinant; nucleophile mutant able to synthesize oligosaccharides | 99, 139
Proteases | Lysine aminopeptidases (KAP) | Baking, brewing, detergents, leather industry | Pyrococcus furiosus | 100 | 8.0 | First KAP from an archaeon | 129
 | Lon proteases |  | Thermococcus kodakaraensis | 70 | 9.0 | Recombinant; membrane bound; ATP-independent Lon protease on unfolded protein substrates | 42
 | Aminopeptidases |  | Haloarcula marismortui | 40 | 8.0 | Active in 2 M KCl | 40
 | Cysteine protease |  | Sulfolobus solfataricus | 70 | 7.5 | Intracellular | 49
 | Serine protease |  | Aeropyrum pernix K1 | 90 | 8.0-9.0 | Extracellular; recombinant | 19
 |  |  | Thermococcus stetteri | 85 | 8.5-9.0 | Extracellular; 1% SDS-resistant | 72
 |  |  | Staphylothermus marinus | 90 | 9.0 | Membrane associated; resistant after a 135°C treatment | 83
Esterases/lipases | Carboxylesterase | Organic synthesis in industrial processes | Sulfolobus solfataricus | 95 | 7.0-9.0 | Recombinant |
 |  |  | Pyrobaculum calidifontis | 90 | 7.0 | Recombinant | 55
 |  |  | Archaeoglobus fulgidus | 80 | 7.0 | Recombinant | 80
L-Aminoacylase |  | Production of L-amino acids from racemic solutions of N-acetyl amino acids | Thermococcus litoralis | 85 | 8.0 | Recombinant; enantiospecific for L-amino acids | 135, 137
Alcohol dehydrogenases |  | Stereochemistry | Sulfolobus solfataricus | 85 | 8.8-9.6 | Recombinant; extremely enantio- and stereoselective | 101
 |  |  | Aeropyrum pernix | 95 | 8.0-10.0 | Recombinant |
 |  |  | Pyrococcus furiosus | 90 | 7.0-10.0 | Recombinant | 141
DNA polymerases | Pfu pol | In PCR for DNA fragment up to 4 kb | Pyrococcus furiosus | 72 |  | Low processivity; uracil stalling |
 | Vent pol | - | Thermococcus litoralis | 72 |  | Low processivity; uracil stalling |
 | Deep Vent pol | - | Pyrococcus sp. GB-D | - |  | Low processivity; uracil stalling |
 | Tgo pol | In PCR reaction for DNA fragment up to 3.5 kb | Thermococcus gorgonarius | 72 |  | Low processivity; uracil stalling |
 | PfuUltra pol | Pfu mutant with improved fidelity; combined with dUTPase (ArchaeMaxx; U.S. patent 2005,003,401) for PCR product up to 17 kb genomic | Pyrococcus furiosus | 72 |  | Low processivity; uracil stalling | 53
 | Platinum Pfx pol | In PCR reaction for DNA fragment up to 12 kb genomic and 20 kb vector | Thermococcus kodakaraensis | 68 |  | High processivity; uracil stalling |
DNA ligases |  |  | Thermococcus kodakaraensis | 70-80 | 8.0 | First DNA ligase from an archaeon | 89, 90
 | Stratagene (U.S. patent 6,280,998) |  | Pyrococcus furiosus | 80 | 7.5 | Higher melting temperature for ligase chain reaction |


Esterase/lipase


Esterases and lipases are some of the most ver- 
satile industrial enzymes as they have been exploited 
in the food industry, in lubricants and cosmetic for- 
mulations, in the pulp and paper industry (136, 149), 
and in organic synthesis in which they are the most 
widely used biocatalysts (28). In organic synthesis, es- 
terases/lipases are exploited in hydrolysis, transester- 
ification, alcoholysis, acidolysis, and aminolysis reac- 
tions. The application of these enzymes was extended 
after the discovery of thermophilic versions; for ex- 
ample, thermostable lipases have enhanced the phys- 
ical refining of seed oils by enabling the separation 
of the lysophosphatide from oil at pH 5.0 and 75°C 
(50). Recently, several esterases/lipases have been de- 
scribed from archaea; the enantioselectivity and the 
catalytic efficiency in organic solvents of the esterase 
from S. solfataricus have been described in detail (120-
122), and, more recently, three open reading frames 
(ORFs) producing enzymes with esterase activity and 
an additional phosphotriesterase have been cloned 
from the same organism (70, 85). Other enzymes have 
been identified in Pyrobaculum calidifontis (55), P. fu- 
riosus (60), and Archaeoglobus fulgidus (80) for which 
the three-dimensional structure of the carboxyl-esterase 
is available (29). Patents on esterase enzymes from
various archaeal genera (Pyrodictium, Archaeoglobus, 
Thermococcus, and Sulfolobus) have been filed (106), 
but no extant application of the enzymes is known. 


Glycosidases 


Industrial glycoside hydrolases are widely used
for the hydrolysis of starch and β-polysaccharides.
Starch is composed of α-glucose units linked by
α-1,4- and α-1,6-glucosidic bonds. The linear polymer
amylose consists of α-1,4-linked glucopyranose
residues, and amylopectin has additional α-1,6-linked
branch points every 17 to 26 glucose units
along the linear polymer. Approximately 1.4 × 10?
tons of starch is produced every year (67), and it is
the main source of sugar for the food industry. One of 
the largest markets of starch enzymatic hydrolysis is 
the production of glucose/fructose syrups for soft 
drinks. Starch conversion can also produce anticari- 
ogenic sugars, such as isomaltooligosaccharides (67), 
antistaling agents in baking (46), and cyclic glucans 
(cyclodextrins) that have applications as carriers of 
small molecules, artificial protein chaperones, and 
thermoreversible starch gels (67). 

Enzymatic hydrolysis of starch occurs through
liquefaction, which includes gelatinization (swelling of
starch at 105 to 110°C, pH 5.8 to 6.0) and saccharification
(production of maltodextrins by an α-amylase
at 95°C for 2 to 3 h). After liquefaction, glucose and
maltose syrups are produced by the combined action
of pullulanase with glucoamylases or β-amylases. In
these processes, temperature and pH control is critical:
gelatinization below 105°C produces incomplete
swelling, making the refining of starch difficult, while
above 105°C the enzymes become inactivated, reducing
the yields of saccharification. Similarly, a pH that
is higher or lower than 5.5 can lead to by-products
and unwanted color formation (145). To avoid these
problems, pH adjustments are required before and after
the liquefaction step, thereby increasing the cost of
chemicals. Termamyl and Fungamyl are commercially
available enzymes from nonarchaeal, moderately thermophilic
organisms that are used in the degradation
of starch. Their activity depends on calcium, and the
formation of calcium oxalate can damage the industrial
plant (50). An engineered version of Termamyl
(Termamyl LC) greatly reduces the amount of calcium
that is required (50). Extremophilic amylases
active and stable at 100°C, pH 4.0 to 5.0, and not requiring
calcium, would be well suited for making
starch conversion more economical. In addition, thermostable
pullulanases, β-amylases, and glucoamylases
could be used in a “one-pot” strategy during starch
biodegradation.

Hyper/thermophiles, including members of the 
Archaea, synthesize amylolytic enzymes to hydrolyze 
glycogen. The enzymes include α-amylases, glucoamylases,
α-glucosidases, and pullulanases (reviewed in
reference 12). Enzymes that have potential biotechnological
application include an α-amylase from Pyrococcus
woesei that does not require calcium for activity
and stability (41), and a novel type III pullulanase
from Thermococcus aggregans that differs from type
I and type II enzymes on the basis of its substrate
specificity. This type III enzyme simultaneously attacks
α-1,4- and α-1,6-glycosidic linkages in pullulan,
producing a mixture of maltotriose, panose, maltose, 
and glucose (92). Glucoamylases from members of 
the genera Picrophilus and Thermoplasma that are 
active at 70 to 100°C and pH 0.7 to 3.0 may have 
biotechnological potential (12). The use of these en- 
zymes for starch degradation requires the develop- 
ment of large-scale recombinant expression. 

Cellulases and hemicellulases are also important in 
biotechnology. Cellulose is a linear polymer of β-1,4-linked
glucose units and is the most abundant natural
polymer on Earth. Cellulolytic enzymes are exploited
in a variety of industrial fields (reviewed in
reference 10). The crystalline structure of cellulose 
makes it a recalcitrant substrate for hydrolysis. Ther- 
mostable cellulases operating at high temperature allow 
the loosening of the wood fibers and provide access to
the hydrolytic enzymes. Therefore, extremophilic cellulases
would be very useful alternatives to mesophilic
counterparts in the production of fermentable sugars 
for fuel ethanol (150), in the biostoning of denim, and 
in detergent formulas to improve the color brighten- 
ing and softening by the biopolishing of the cotton 
fabrics. In particular, while commercial enzymes are 
active only at 50 to 55°C, this latter application re- 
quires cellulases that are stable at temperatures close 
to 100°C (5). 

The dominating component of hemicelluloses is 
xylan, a linear polymer of β-1,4-linked xylose residues.
Xylan degradation is one of the steps in pulp and
paper refining, since its hydrolysis facilitates lignin
removal during bleaching. Thermophilic xylanases
would improve this process by disrupting the wood 
structure at elevated temperature, reducing the chlo- 
rine consumption and the consequent environmental 
pollution. However, these enzymes must be devoid of 
cellulolytic activity, should be of low molecular size to 
enable easier access to the pulp fibers, and should be 
available at very low costs (50). 

Interesting reviews on hyperthermophilic cellu- 
lases and hemicellulases are available in the literature 
(11, 130). These enzymes are uncommon in archaea, 
although cellulases have been found in P. furiosus (4, 
16), P. horikoshii (5), the archaeon AEPII1a (77), and 
in the thermoacidophile S. solfataricus (58, 79). Xy- 
lanase activity has been demonstrated in Thermococ- 
cus zilligii (140), although its unequivocal assignment 
is still disputed (110). Xylanases have also been re- 
ported in P. abyssi (7), Halorhabdus utahensis (146), 
and S. solfataricus (17). The xylanases from P. abyssi 
and S. solfataricus are highly thermostable and ac- 
tive up to 110°C and 90°C, respectively, while the en- 
zyme from H. utahensis is, quite surprisingly, active 
over a broad range of NaCl concentrations (0 to 
30%). Presently, the main drawback for the exploita- 
tion of archaeal cellulolytic and xylanolytic enzymes 
is the difficulty of expressing them in heterologous 
hosts (57), making their application to an industrial 
scale problematic. 


Another class of glycosidases that has biotechno- 
logical value is chitinases. Chitin is a waste polymer 
produced from the exoskeleton of crabs and shrimp, 
and it is used to produce chitosan that has a variety of 
applications in cosmetics, paper and photographic 
products, and medical fields (50). Extremophilic chitinases
would be attractive alternatives to the current
use of corrosive solutions of 40% sodium hydroxide
for the hydrolysis of chitin (50). Three hypothetical
proteins have been found to constitute a new path- 
way for the degradation of chitin in Thermococcus 
kodakaraensis (132-134). This is a nice example il- 
lustrating how the biodiversity of archaea has en- 
hanced the potential for identifying novel biotechno- 
logical applications. 

Glycosidases are attractive enzymes also for the 
synthesis of carbohydrates, which are becoming in- 
creasingly important as therapeutics in the phar- 
maceutical industry (154). Glycosidases synthesize 
oligosaccharides in reactions of reverse hydrolysis 
(equilibrium-controlled synthesis) or transglycosyla- 
tion (kinetically controlled process) in which an al- 
cohol or another sugar acts as acceptor instead of 
water (Fig. 1). The β-glycosidase from the hyperthermophilic
archaeon S. solfataricus promotes transglycosylation
reactions to pyranosides with a yield of
10 to 40% (25). However, since the product of the reaction
has the same anomeric configuration as the
substrate, it is hydrolyzed by the enzyme with a consequent
reduction in the final yield. To avoid these
problems, a new class of engineered glycosidase has 
been produced to promote the synthesis of sugars 
with almost quantitative yields; these novel enzymatic 
activities have been termed glycosynthases (reviewed 
in reference 97). Three hyperthermophilic glycosynthases
have been produced by engineering the β-glycosidases
from S. solfataricus, Thermosphaera aggregans,
and P. furiosus (87, 96). The approach used to
obtain hyperthermophilic glycosynthases differs from
that described for mesophilic enzymes (Fig. 2A). The
active site carboxylate (glutamic acid), acting as the
nucleophile of the reaction, was changed to a nonnucleophilic
residue (glycine) by mutagenesis, leading
to a completely inactive enzyme. The mutant was reactivated
at 65°C in the presence of sodium formate
buffer, pH 4.0 (functioning as an external nucleophile),
and aryl-glycoside (the activated substrate) (96). The
mutant enzyme, assisted by the external nucleophile,
formed the formyl-glucoside intermediate in situ. In
the second step of the reaction, a substrate molecule,
working as acceptor, resolves the intermediate, leading
to the product (Fig. 2A). The key to this approach is
that the product accumulates in the reaction because
it is a nonactivated disaccharide that cannot be hydrolyzed
by the mutant. When sufficient amounts of
the disaccharide product are present, they compete
with the initial substrate and act as new acceptors, leading
to the formation of trisaccharides; the iteration of
this process leads to the generation of oligosaccharides
of up to four glycosidic residues containing branched
chains with β-1,3 and β-1,6 bonds (138, 139) (Fig.
2B). This is a clear example of how the unique characteristics
of stability to high temperatures and acidic
pH of the glycosidases from hyperthermophilic archaea
allowed the development of a novel strategy for
the exploitation of oligosaccharide synthesis.


Figure 1. Reaction mechanism of β-glycosidases. Hydrolysis occurs if “X” is a hydrogen (H) atom, or transglycosylation occurs if “X” is a different “R” group.

Figure 2. Reaction mechanism of hyperthermophilic glycosynthases. (A) Scheme of the glycosynthase mechanism: the transferred sugar is shown in bold. (B) Scheme of the processive glycosynthetic reaction: the transferred sugar in a chair conformation is shown with closed symbols; the aryl leaving group is shown as a hexagon.


Archaeal enzymes in molecular biology 


The amplification of DNA fragments by the well- 
defined technique of the PCR has led to a burst in the 
knowledge of life sciences, comprising molecular and 
cellular biology, genetic engineering, and biotechnol- 
ogy. The use of the DNA polymerase from the ther- 
mophilic bacterium Thermus aquaticus (Taq) bypassed 
the problem of the addition of a mesophilic enzyme at 
each cycle after the denaturation step (38), leading to 
a rapid development of the PCR technique that is 
widely used (22, 64). Taq DNA polymerase is used 
for a large number of PCR applications. However, the 
enzyme exhibits poor fidelity due to a lack of 3'-5' exonuclease
activity (proofreading activity). For this reason
it is not used when high-fidelity amplification is
required, such as during the analysis of allelic poly- 
morphisms, allelic stages of single cells, or rare mu- 
tations in human cells. 

Significant time and effort have been spent to improve
the performance of Taq DNA polymerase (34).
However, in the past decade, knowledge of archaeal 
DNA polymerase enzymes has led to their use in a 
range of PCR applications (Table 1). All archaeal DNA 
polymerases (B-type) (95) possess proofreading ac- 
tivity and lack 5'-3' exonuclease activity. These fea- 
tures of the archaeal enzymes have facilitated the de- 
velopment of PCR techniques by eliminating the need 
for downstream error-correction steps and minimizing
the number of clones that must be sequenced to
obtain error-free constructs. However, most archaeal
DNA polymerases also have disadvantages compared
with the bacterial Taq polymerase: (i) they exhibit limited
processivity in vitro (the polymerization rate is in the
range 9 to 25 s⁻¹, compared with 47 to 61 s⁻¹ for
Taq polymerase) and (ii) their synthesis stalls at uracil
residues (dU), which arise from the temperature-dependent
deamination of dCTP during PCR, thereby decreasing
their performance (48).
These characteristics preclude amplifications of DNA
fragments longer than 3 to 4 kb, since prolonged extension
times (1 to 2 min kb⁻¹ at 72°C) promote dUTP
formation.

The problem with low processivity has been overcome
with the isolation of an archaeal DNA polymerase
from Thermococcus kodakaraensis, which exhibits
a high extension rate (106 to 138 s⁻¹) and low
error rate (131), and is therefore able to perform longer
PCR reactions similar to Taq DNA polymerase. The
limitation of the dUTP formation was also overcome
by the introduction of a new class of archaeal enzyme:
dUTPases. These thermostable biocatalysts from Pyrococcus
spp. prevent the incorporation of dUTP in PCR
products by transforming dUTP to dUMP, consequently
increasing the final PCR yield (53).
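
As a rough illustration of why the extension rate matters, the following sketch estimates how long uninterrupted synthesis of a fragment would take at the rates quoted above. It assumes that the quoted rates are nucleotides incorporated per second and that the rate is constant; the function names and the 3,000-nucleotide example length are illustrative choices and do not come from the cited studies.

```python
# Back-of-the-envelope synthesis times at the polymerization rates quoted
# in the text. Assumption: the rates are nucleotides per second.

RATES_NT_PER_S = {
    "typical archaeal B-type": (9, 25),
    "Taq": (47, 61),
    "T. kodakaraensis": (106, 138),
}


def synthesis_time_s(length_nt, rate_nt_per_s):
    """Seconds needed to copy length_nt nucleotides at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate_nt_per_s


def time_range_s(length_nt, rate_range):
    """Return (fastest, slowest) time in seconds for a (low, high) rate range."""
    low, high = rate_range
    return synthesis_time_s(length_nt, high), synthesis_time_s(length_nt, low)


if __name__ == "__main__":
    for name, rates in RATES_NT_PER_S.items():
        fast, slow = time_range_s(3000, rates)
        print(f"{name}: {fast:.0f} to {slow:.0f} s for 3,000 nt")
```

These are idealized lower bounds on synthesis time only; the extension times used in practice are longer, which is why slow enzymes expose dCTP to more heat-driven deamination per cycle.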

Most currently available commercial products 
for the amplification of DNA are composed of a mix- 
ture of archaeal and bacterial thermostable DNA 
polymerases to ensure that low error rates and high 
processivity are achieved. Examples include Hercu- 
lase and PicoMaxx (Stratagene, USA), in addition to 
the dUTPase (ArchaeMaxx Polymerase Enhancing 
Factor), which minimizes uracil poisoning, resulting 
in amplification products as long as 19 kb (14, 52). 


Other enzymes 


Transaminases and aminotransferases are widely 
used in the food industry to improve the viscosity of 
products and to produce L- or D-amino acids as food 
additives and precursors of antibiotics for the manu- 
facture of pharmaceuticals and agricultural products 
(71). Patents have been filed for this class of enzymes 
from thermophilic bacteria and the hyperthermo- 
philic crenarchaeon P. aerophilum (147); the trans- 
aminase provided improved stability at high tempera- 
ture and in organic solvents for the production of 
optically pure chiral compounds. 

Alcohol dehydrogenases (ADHs) from hyperther- 
mophilic archaea have also found a role in biotech- 
nology industries (100). Hyperthermophilic ADHs 
can be useful for stereoselective transformation of 
ketones to alcohols, and vice versa. The ADH from
S. solfataricus is a zinc-containing, NAD(H)-dependent
enzyme that is able to convert 3-methylbutan-2-one to 
(S)-3-methylbutan-2-ol with almost 100% stereoselec- 
tivity (39, 101). The ADH from A. pernix produces an 
enantiomeric excess of about 90% with 2-octanone, 
2-nonanone, and 2-decanone substrates (51). In con- 
trast, although the ADH from P. furiosus has broad 
substrate specificity, it exhibited only a slight prefer- 
ence for (S)-2-butanol (141). 

Several other archaeal enzymes have attractive 
properties for chemical synthesis. A cysteine synthase 
from A. pernix has been used for the production of 
sulfur-containing organic compounds (61). A phos- 
phatidylethanolamine N-methyltransferase from P. fu- 
riosus (82) catalyzes the synthesis of phosphatidyl- 
choline, which is used in food as a digestible surfactant, 
and in the medical and pharmaceutical industries as 
a component of microcapsules for drugs. The stability 
of the enzyme in organic solvents and its ability to syn- 
thesize phosphatidylcholine of high optical purity make 
it a good alternative to phosphatidylethanolamine 
N-methyltransferases from bacteria and yeast. 

An interesting example of how an archaeal ex- 
tremozyme can be developed for use in industry is the 
L-aminoacylase from Thermococcus litoralis (135). 
The enzyme, which catalyzes the hydrolysis of a va- 
riety of N-acyl groups from several amino acids, was 
isolated and characterized in a research program involving
Chirotech Technology (UK) and the University
of Exeter (UK). The ability of the L-aminoacylase
from T. litoralis to remove aromatic groups from
N-acylamino acids is a feature that distinguishes it
from the commercially available L-aminoacylase from
Amano Enzyme Inc. (Japan). In addition, the archaeal
enzyme was enantiospecific for L-amino acids,
was active in a broad range of pH values (at least
70% of maximum activity in the pH range 6.5 to
9.5), showed maximum activity at 85°C, and had a half-life
of 25 h at 70°C. To produce a commercial product,
the enzyme was expressed in Escherichia coli at
a 500-liter scale and was partially purified to prevent 
contamination by its genetically modified host. The 
downstream process involved only a 2-fold purifica- 
tion with the final preparation being a cell-free ex- 
tract containing about 80 million units from 30 to 
40 kg of cells. Several commercial applications of the 
L-aminoacylase from T. litoralis are reported in the 
Dowpharma Chirotech Technology catalog (135). 
The archaeal enzyme is reported to improve the resolution
of an N-acyl-protected amino acid: 240 units
of enzyme g⁻¹ of substrate allowed the production
of more than 100 kg of product in 8 to 10 h at 70°C.
The same process performed by heating the racemic 
N-acylamino acid to 50°C completely inactivated the 
mesophilic counterpart. 
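
The figures quoted for the L-aminoacylase process can be combined in a simple calculation. The sketch below assumes that the units of the 80-million-unit preparation and of the 240 units g⁻¹ loading are defined in the same way, which is not stated explicitly; the function names are illustrative.

```python
# Rough arithmetic on the L-aminoacylase figures quoted in the text.
# Assumption: "units" means the same thing for the batch and the loading.

PREPARATION_UNITS = 80e6        # units in one cell-free extract batch
LOADING_UNITS_PER_G = 240.0     # units of enzyme per gram of substrate


def substrate_capacity_kg(total_units, units_per_g):
    """Kilograms of substrate one batch could treat at the given loading."""
    if units_per_g <= 0:
        raise ValueError("loading must be positive")
    return total_units / units_per_g / 1000.0


def units_required(substrate_kg, units_per_g):
    """Enzyme units needed to treat substrate_kg of substrate."""
    return substrate_kg * 1000.0 * units_per_g


if __name__ == "__main__":
    capacity = substrate_capacity_kg(PREPARATION_UNITS, LOADING_UNITS_PER_G)
    print(f"One batch covers about {capacity:.0f} kg of substrate")
    print(f"100 kg of substrate needs {units_required(100, LOADING_UNITS_PER_G):.2e} units")
```

Under this assumption a single 500-liter fermentation supplies enough enzyme for roughly a third of a tonne of substrate at the stated loading.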


MOLECULES FROM ARCHAEA 


Inteins 


Since their discovery in 1990, inteins have been 
studied for their fascinating chemical mechanism of 
cleaving and rejoining proteins. Although their exploitation
is still in its infancy, they provide enormous
opportunities for developing novel tools for
protein engineering and for understanding the fundamental
mechanism of protein splicing. This is an
extraordinary posttranslational processing event that 
involves the precise removal of an internal polypep- 
tide segment, termed intein, from a precursor protein, 
with the concomitant ligation of the flanking poly- 
peptide sequences, termed exteins (Fig. 3) (94). 

Due to rapid splicing in vivo, early in vitro stud- 
ies of the chemical mechanism of protein splicing 
were hampered by the inability to purify precursor 
proteins or splicing intermediates. The discovery of 
inteins from hyperthermophilic archaea was funda- 
mental to overcoming this obstacle. The Pyrococcus 
Psp intein-1 only undergoes splicing at high temperature.
The unspliced precursor protein and the splicing
process could therefore be examined in vitro by inducing
splicing at high temperature (152).

Inteins have been identified in 17 of 22 archaeal 
genome sequences (including Nanoarchaeum equitans)
(http://bioinformatics.weizmann.ac.il/~pietro/inteins; 21, 93).
They were not found in the genome
sequences of A. fulgidus, Methanosarcina acetivorans, 
P. aeropyrum, S. solfataricus, and S. acidocaldarius. 
The crystal structure of the intein spliced out from a 
ribonucleotide reductase from P. furiosus (PI-PfuI) has
been reported (59). 

An intein-mediated affinity protein purification 
system has been developed by New England Biolabs 
(23, 24) and it is called IMPACT (Intein Mediated Pu- 
rification with an Affinity Chitin binding Tag). This 
system allows the purification of a candidate protein 
by thiol-induced cleavage of a fusion protein that is 
bound to a chitin column. The intein-based purifica- 
tion technology avoids the use of exogenous proteases 
that can degrade the protein of interest, thereby offer- 
ing a time-saving and economical approach. 

Since the establishment of the first intein purifi- 
cation system, many advances have been made and 
several other intein fusion systems developed (153). 
For instance, after their discovery mini-inteins replaced 
the full-length inteins. Their smaller molecular weights 
can increase protein expression and final yield. One
of the smallest mini-inteins known has been found in
the ribonucleoside diphosphate reductase gene of the 
archaeon Methanothermobacter thermautotrophicus 
(Mth RIR1 intein) (126). The Mth RIR1 intein, the intein
from the Mycobacterium xenopi gyrA gene (Mxe
GyrA intein), and the mini-intein made from the Synechocystis
spp. DnaB intein are commercially available
from New England Biolabs as the IMPACT-TWIN
(Intein Mediated Purification with an Affinity Chitin-binding
Tag-Two Intein) E. coli expression vectors.


CHAPTER 22 • BIOTECHNOLOGY 487


Figure 3. The intein self-catalytic protein-splicing mechanism is shown. Steps, in order: precursor; N-S acyl shift; thioester intermediate; transesterification; branched intermediate; peptide cleavage with succinimide formation; S-N acyl shift and succinimide hydrolysis, yielding the spliced extein and the excised intein.


488 MORACCI ET AL.


Polymers 


Biopolymers have been identified in different 
groups of archaea, including bioplastics from halo- 
philes and exopolysaccharides (EPS) from halophiles 
and thermophiles. T. litoralis secretes EPS when grown 
on maltose (104), and Sulfolobus sp. produces a sul- 
fated exopolysaccharide containing glucose, mannose, 
glucosamine, and galactose when grown on glucose 
in stationary phase (91). Microbial exopolysaccha- 
rides are used as stabilizers, thickeners, and emulsi- 
fiers in several industries (e.g., pharmaceutical, tex- 
tile, paper), and the major commercial EPS derives 
from the bacterium Xanthomonas campestris (143). 
The archaeon Haloferax mediterranei secretes up to
3 g liter⁻¹ of a highly sulfated and acidic EPS that
contains mannose as a major component (8).

In the past decade it was suggested that an acidic 
heteropolysaccharide from H. mediterranei, which 
showed excellent rheological properties, could have 
potential application as an emulsifying agent in the oil
industry (143). Residual oil in natural oil fields can be 
extracted by injection of pressurized water into a new 
well; in these circumstances, halobacterial membrane 
lipids and EPSs from this archaeon may be useful due 
to their biosurface activity and bioemulsifying prop- 
erties (143). 

Among biopolymers, bioplastics have, by far, the 
widest range of biotechnological applications; these 
compounds are produced from polyhydroxyalka- 
noates (PHAs), a heterogeneous family of polyesters 
of which poly-β-hydroxybutyrate (PHB) is the most
common. Several marine archaea accumulate large 
amounts (25 to 80% of dry weight) of intracellular 
PHA that are subsequently metabolized for carbon 
and energy during periods of starvation (148). PHAs 
are used in industry for the production of thermo- 
plastic polymers, the properties of which resemble 
polystyrene, polycarbonate, and polypropylene. How- 
ever, these bioplastics are biodegradable. For instance, 
PHB has a structure similar to polypropylene and 
sinks to the sediment where it can be degraded in ap- 
proximately one month, rather than in one year as is 
typically the case for polypropylene (148). The ar- 
chaeon H. mediterranei produces large amounts of 
PHA when grown on starch or glucose (78). The carbon
source (starch instead of glucose used by bacterial
halophiles) is relatively inexpensive, and yields are
higher due to the large accumulation of polymer and 
ease of cell lysis (143). 

Several efforts to commercialize PHA, notably 
by ICI in the 1980s and early 1990s under the trade 
name of Biopol, and by Monsanto in the mid-1990s, 
foundered due to problems associated with high cost 
and a very limited ability to process PHA. In 2001, 
Metabolix purchased Monsanto’s Biopol assets to 
add to its current range of PHA. These products in- 
clude molding resins, coatings for water-resistant pa- 
per bags, films, adhesives, and fibers for textiles and 
carpets (www.metabolix.com/), thereby illustrating
the real potential for application of archaeal PHA. 


Compatible Solutes 


Microorganisms cope with high osmolarity by
taking up K⁺ from the environment, or by synthesizing
compatible solutes such as amino acids and
their derivatives, sugars and their derivatives, polyols,
betaines, and ectoines (27, 44, 105, 107, 112) (see
Chapter 16). Compatible solutes are small, highly 
soluble, organic molecules that do not interfere with 
central metabolism, even if they accumulate to high 
concentrations. Compatible solutes from archaea 
have biotechnological roles as cryoprotectants and 
preservatives. 

Some thermophilic and hyperthermophilic organ- 
isms accumulate di-myo-inositol-phosphate, di-man- 
nosyl-di-myo-inositol-phosphate, di-glycerol-phos- 
phate, mannosylglycerate, and mannosylglyceramide. 
Compatible solutes from archaea, including P. woe- 
sei, Pyrodictium sp., Thermococcus sp., and A. fulgi- 
dus, tend to be negatively charged, although P. aero- 
philum and Sulfolobus sp. accumulate the neutral 
compatible solute, trehalose, which is widespread in 
nature. In addition to osmoadaptation, compatible 
solutes may protect cells from dehydration, freezing, 
desiccation, high temperature, and oxygen radicals 
(see 112 and references therein). Due to the wide range 
of protective effects offered by compatible solutes, 
they have several biotechnological applications, including
the conservation of tissues and other biological
products in the medical, food, and cosmetic industries,
and the preservation of cultures and cell lines in scientific
research (116).

Trehalose and mannosylglycerate have found 
particular biotechnological applications. Trehalose 
(α-D-glucopyranosyl-α-D-glucopyranoside) is currently
synthesized using two enzymes from Arthrobacter sp. 
The first enzyme, called Q36, is a trehalosyl-dextrin 
forming enzyme (EC 5.4.99.15) that converts dextrins 
to trehalosyl-dextrin, and a trehalose-forming enzyme
(EC 3.2.1.141) releases trehalose and low-molecular-weight
dextrins from trehalosyl-dextrins (116). The
two enzymes work sequentially at 45°C. The use of 
thermophilic enzymes may help to reduce viscosity 
and microbial contamination. The same two enzymes 
have been identified in S. solfataricus strains KM1 
and P2 (73; www-archbac.u-psud.fr/Projects/Sulfolobus/Sulfolobus.html),
in S. shibatae (30), and in S.
tokodaii (www.bio.nite.go.jp/dogan/Top). When the 
Sulfolobus enzymes were expressed in E. coli in a 
high-density microfiltration bioreactor, the recombi- 
nant cells produced trehalose from dextrins at 75°C 
with a final conversion of 90% (31). This study illus- 
trates the true potential for industrial application. 

Mannosylglycerate has potential use as a bioprotectant.
This compound, together with di-myo-inositol-phosphate,
accumulates in Pyrococcus sp. with increasing
NaCl concentration of the medium (112).
The use of mannosylglycerate as an enzyme stabilizer
has been patented (113) and studied extensively (13).
It is considerably better than trehalose as a thermoprotectant
for enzymes, and as good as trehalose for protecting
enzymes against desiccation (113). For example,
after incubation for 10 min at 50°C, the residual
activity of rabbit muscle lactate dehydrogenase in- 
creases from 5% to 90% in the presence of manno- 
sylglycerate and trehalose, respectively (113). The 
patent showed that mannosylglycerate could be pu- 
rified from ethanolic extracts of several archaea, in- 
cluding P. furiosus, P. woesei, Methanothermus spp., 
and Thermococcus spp. grown in complex medium at 
supraoptimal temperatures and salinities. For P. fu- 
riosus, 2.5 g of mannosylglycerate was obtained from 
100 g of cell paste (113). 


Self-Assembling Components 


Self-assembling S-layer glycoproteins (118, 125) 
and bacteriorhodopsins (15, 143, 151) have drawn 
interest for their potential in molecular nanotechnol- 
ogy. S layers are commonly observed as cell envelopes 
of bacteria and archaea (see Chapter 14). The S layers 
of these organisms are particularly appealing for bio- 
technology because they form a regular lattice that is 
very stable. In addition, because each unit of the S-layer 
lattice possesses identical physicochemical properties, 
they form unique structures. The mechanical and permeability
properties of these structures have enabled
S layers to be used as ultrafiltration membranes that 
exhibit precise molecular exclusion properties, and 
are therefore interesting alternatives to conventional 
membranes (125). Moreover, S layers from many ar- 
chaea and some bacteria are glycosylated (36, 114), 
which might allow macromolecules to be immobilized
not only on the protein moiety, but also on the
carbohydrate residues. 

Bacteriorhodopsin is a photoactive proton pump 
present in the purple membrane of halophilic archaea, 
including Halobacterium salinarium. At low oxygen 
availability, bacteriorhodopsin trimers are assembled 
in a two-dimensional hexagonal lattice in the purple 
membrane. The primary function of bacteriorhodopsin 
is to pump protons (see Chapter 16). The main bio- 
technology application of this protein resides in its 
ability to absorb light energy and convert it to chemi- 
cal energy. Bacteriorhodopsin-based devices have been 
used since the 1980s for optical data processing (holog- 
raphy, information storage), as biochips, and as light 
sensors obtained by sandwiching the protein between 
an oxide electrode and an electrically conductive gel 
(143). In the late 1970s, the Soviet military explored
the potential of bioelectronics in computer technology
in a program called “Project Rhodopsin.” The
project led to the production of bio-chrom, a real-time
photochromic and holographic film containing bacteriorhodopsin
(15). More recently, a new generation of
bacteriorhodopsin-based devices was developed using
genetic manipulation, which led to the generation of
a mutant with a 700-fold improvement in volumetric
data storage (reviewed in reference 151).


ARCHAEAL WHOLE CELLS 


The utilization of archaeal cells in biotechnological
applications is limited by the low biomass yield of
cells and the limited availability of molecular biology tools
(see Chapter 21). The biotechnology of archaeal whole
cells is still in its infancy, and the applications are lim- 
ited to the treatment of contaminated soil and waste- 
water (bioremediation) and hydrogen production. 


Bioremediation 


The treatment of contaminated soil and waste- 
water with microorganisms to remove toxic organic 
compounds, metals, etc., is an established practice that 
is undergoing continuous expansion. Although bacteria
have been exploited (e.g., treatment of petroleum
wastes; reviewed in reference 142), archaea have
not been widely used. Nevertheless, studies on the mi- 
crobial populations associated with treatment of con- 
taminated soil and water revealed the presence of ar- 
chaea. For example, methanogenic archaea were found 
as syntrophic consortia in the leachate of a full-scale
recirculating landfill (57), in anaerobic reactors for 
the treatment of industrial dye effluents (98) and domestic
wastewater (84), in soil decontaminated by
steam treatment (103), and in pharmaceutical and al- 
cohol distillery wastewater (2). Most of the archaea 
identified belong to species of Methanosaeta, Metha- 
nobacterium, and Methanospirillum. In contrast to 
these examples, archaeal populations were generally 
reduced in oil-polluted beach ecosystems, suggesting 
that they play a small role in oil degradation in comparison
with bacteria (109).

The presence of methanogenic archaea in conta- 
minated anaerobic environments is consistent with 
their anaerobic lifestyle. However, the importance of 
these microorganisms in aerobic wastewater has re- 
ceived little attention. More recently, anoxic micro- 
environments containing methanogenic archaea have 
been identified in industrial and domestic aerated acti- 
vated sludge plants (47). These types of studies (47) are 
particularly useful as they provide an improved under- 
standing of the ecological forces governing the waste- 
water treatment and may therefore help in optimizing 
the process. The study highlighted that the dynamics 
of microbial-driven processes are often determined by 
minority populations. One remarkable example of this 
type of regulation was recently reported for an acti- 
vated sludge reactor (63). Periodic addition of solu- 
tions containing anaerobic methanogenic archaea to a 
conventional activated sludge system led to increased 
nitrogen removal and to sludge reduction. This method 
has been adopted by ArchaeaSolutions, Inc.
(www.archaeasolutions.com, Tyrone, Ga.) in the treatment
of wastewater from winery industries. 


Hydrogen Production 


Hydrogen gas is an attractive alternative to fossil 
fuel as it is a clean, nonpolluting source of energy. 
However, conventional production is based, at pre- 
sent, on the steam reforming of natural gas and pe- 
troleum, and microbial production of H; is gaining 
increasing interest. Many studies on the microbial 
production of H, have focused on the bacterial gen- 
era, Clostridium (68) and Enterobacter (75). In the 
early 1990s, P. furiosus was shown to efficiently pro- 
duce H, when grown on rich medium (115), suggest- 
ing that archaea can also play a role in this field. 
More recently, T. kodakaraensis KOD1 was shown to 
produce H; at a rate ranging from 14 to 59.6 mmol 
per gram dry weight h~! (depending on the dilution 
rate), when grown at 85°C on rich medium supple- 
mented with pyruvate or starch (66). These values are 
comparable to those reported for Enterobacter cloa- 
cae (29.6 mmol per gram dry weight h~t). The higher 
operational temperatures of the hyperthermophilic 
archaea provide the advantages of eliminating the risk 
of microbial contamination and of allowing starch 
liquefaction, thereby increasing the solubility of 
the substrate. These preliminary reports illustrate a 
high potential for using carbohydrate-degrading ar- 
chaea for H₂ production. 
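
The rate comparison above can be made explicit with a short calculation. The following is a minimal sketch in Python that uses only the three rates quoted in the text (in mmol H₂ per gram dry weight per hour); the function and constant names are illustrative and do not come from the cited studies.

```python
# Rates quoted in the text, in mmol H2 per gram dry weight per hour.
T_KODAKARAENSIS_RANGE = (14.0, 59.6)  # depends on the dilution rate (66)
E_CLOACAE_RATE = 29.6                 # bacterial reference value


def relative_rate(rate, reference=E_CLOACAE_RATE):
    """Return a production rate as a multiple of the reference rate."""
    if reference <= 0:
        raise ValueError("reference rate must be positive")
    return rate / reference


low, high = (relative_rate(r) for r in T_KODAKARAENSIS_RANGE)
print(f"T. kodakaraensis: {low:.2f}x to {high:.2f}x the E. cloacae rate")
```

On these figures the archaeal rates range from roughly half to about twice the bacterial value, which is the sense in which the two are described as comparable.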


PERSPECTIVE: THE NEXT FIVE YEARS 


Extremophilic archaea have been considered an 
interesting source of molecules for novel biotechno- 
logical applications. Their stability and activity under 
extreme conditions make them useful alternatives to 
labile mesophilic counterparts. Therefore, it is not 
surprising that a number of companies worldwide ex- 
ploit extremophilic organisms and their biomolecules. 
Nevertheless, products from archaea represent a small 
part of commercial catalogs. One of the leading bio- 
technology companies, Genencor (USA), has no prod- 
ucts, either in development or on the market, that are 
derived from archaea, despite the fact that it deals 
with enzymes from extremophiles. Presently, only two 
companies are fully committed to the exploitation of 
archaea and their biomolecules: Archaezyme Ltd., 
Jerusalem, Israel, and ArchaeaSolutions Inc., Tyrone, 
Georgia, USA. However, this is a rapidly evolving 
field. In the past year, numerous exciting findings that 
have potential relevance to biotechnology have been 
reported, two of which are the genome sequence of 
the extreme acidophile P. torridus (43) and a new 
mechanism of gene expression in N. equitans (102). 
Such discoveries hold great expectations for the fu- 
ture. There is much to be gained from the exploita- 
tion of archaea, and this will be fostered not only by 
scientific advances, but also by the willingness of in- 
dustry and government to make suitable investment. 
Chapter 23 


Archaeosome Vaccines 


G. DENNIS SPROTT AND LAKSHMI KRISHNAN 


INTRODUCTION 


The discovery that hydration of polar lipids, 
such as phosphatidylglycerol and phosphatidylcho- 
line, causes the spontaneous formation of liposomes 
(2), led to concepts for use of liposomes as biochemi- 
cal tools (19), drug delivery systems (23), and anti- 
gen carrier systems for vaccines (1). A description of 
these important data that laid the foundation for li- 
posome methodologies can be found elsewhere (39). 

The polar membrane lipids from archaea are 
structurally unique (see Chapter 15) and appear to 
have uses in biotechnology. Predating the discovery of 
Archaea as a domain of life, novel nonsaponifiable 
lipids were discovered in the archaeon Halobacterium 
salinarum (52). Subsequently, it was established that 
the polar lipids of this extremely halophilic archaeon 
are based on archaeol (2,3-di-O-sn-phytanylglycerol), 
a novel lipid with a glycerol backbone linked at car- 
bons 2 and 3 via ether bonds to two saturated, carbon- 
20, isopranoid chains (26) (see Chapter 15). It is now 
known that the archaeol lipid structure is a distin- 
guishing and ubiquitous feature in Archaea. In addi- 
tion, the dimer form of archaeol, caldarchaeol, and 
other more subtle variations to these core lipids are 
characteristics peculiar to each strain of archaeon (27). 

The term liposome defines a bilayer arrangement 
of lipids and does not accurately describe archaeal 
lipid “liposomes” that contain caldarchaeol bipolar, 
membrane-spanning lipids. To account for the pres- 
ence of unilayer membrane regions described below 
(see “Stability,” below), as well as for simplicity, “ar- 
chaeosome” was proposed to replace “archaeal lipid 
liposome” as a new term to describe the closed lipid 
vesicles prepared from archaeal lipids (55). 

As members of the Archaea can thrive in extreme 
environments, their polar lipids are expected to be 
stable and therefore to be of value for a range of bio- 


technological applications. Archaeosomes have been 
developed from various archaea for use as drug de- 
livery systems (11, 44, 49) and vaccine applications 
that utilize their adjuvant properties (29, 63, 66). 

This chapter describes the mechanism of ar- 
chaeosome adjuvants as self-adjuvanting antigen car- 
rier systems that are taken up by specific receptor- 
mediated endocytosis to promote both CD4⁺ and 
CD8⁺ T-cell responses. The effects of lipid structure 
on the immune response are also reviewed. 


PREPARATION OF ARCHAEOSOMES 


Archaeal lipids possess several features that 
make them ideal for the preparation of archaeo- 
somes. The first is the inherent stability of the polar 
lipids, allowing long-term storage of these lipids in air 
without oxidation or other chemical changes. This 
stability results from the fully saturated nature of the 
isopranoid chains and the stable ether linkages be- 
tween the chains and the glycerol backbone. Excep- 
tions to this rule of archaeal lipid saturation may be 
found, as the polar lipids of some cold-adapted ar- 
chaea exhibit unsaturation of their isoprenoid chains 
(20, 40), and these sources have been excluded from 
most applied studies. The archaeal sn-2,3 stereo- 
chemistry versus sn-1,2 found in glycerolipids of the 
Bacteria and Eucarya may influence their susceptibil- 
ity to enzymatic attack (27) and contribute to stability 
properties. Second, archaeal lipids form archaeo- 
somes over physiological temperature ranges, allow- 
ing preparation of vaccines at ambient temperatures. 
Formation of liposomes from nonarchaeal lipids must 
be performed above the phase transition temperature, 
where heat sensitivity of the active ingredient being 
entrapped could be an issue. Third, once formed, ar- 
chaeosomes of 50- to 250-nm diameters remain sus- 
pended indefinitely and resist fusion or aggregation 
over long storage periods. The stability of archaeo- 
some membranes accounts for the long retention times 
for entrapped compounds (11, 45, 49). 

A historical perspective on formation of ar- 
chaeosomes has been described in a review (47). 
Methods to prepare archaeosomes are similar to the 
methods used to prepare liposomes from nonarchaeal 
lipids (60). In brief, the preparation of archaeosomes 
involves growing the archaeon of interest, extracting 
total lipids, and precipitating the total polar lipid 
fraction (the bulk of the lipid) with cold acetone. Dry- 
ing an aliquot of the total polar lipids and hydrating 
in the presence of antigen forms multilamellar ar- 
chaeosomes containing entrapped antigen. In general, 
hydration occurs readily with lipid extracts high in 
archaeols but is more difficult and occurs with lower 
archaeosome yield if the content of caldarchaeols is 
exceptionally high (e.g., total polar lipids from Ther- 
moplasma acidophilum or Sulfolobus acidocaldar- 
ius). The size of archaeosomes is usually reduced by 
brief sonication in a sonic water bath or by extrusion 
through membranes of defined porosity (3) prior to 
removal of any nonentrapped antigen, usually by cen- 
trifugation. The final steps are filtration through a 
sterile 0.45-µm-pore-size filter and quantification of 
entrapped antigen. Archaeosome size is measured in a 
submicron particle sizer and is typically 50 to 250 nm. 
Each archaeon has its own unique, total polar lipid 
composition that, in turn, imparts unique properties 
to the resulting archaeosomes. 

To avoid the necessity of purifying lipids for ar- 
chaeosome preparations and to potentially achieve 
multiple receptor interactions, most studies that have 
used archaeosomes as adjuvants included the total po- 
lar lipids from a chosen archaeon. Using total polar 
lipids also has the advantage of producing a mixture 
of polar lipids that form stable archaeosomes (10). 


STABILITY 


An important consideration relating to the stabil- 
ity of archaeosomes is the proportion of archaeol to 
caldarchaeol lipids in the preparation. This in turn may 
have profound effects on vaccine adjuvant properties. 

Bipolar caldarchaeol lipids span lipid films and 
intact cell membranes with a polar group facing each 
side (21, 33) (see Chapter 15). Archaea, such as 
H. salinarum, are incapable of caldarchaeol synthesis. 
Many other archaea (e.g., T. acidophilum) generate a 
mixture of archaeols and caldarchaeols. As a result, 
three different membrane structures occur in ar- 
chaeosomes depending on the archaeal lipid source 
(Fig. 1). A bilayer may form from archaeal lipid ex- 
tracts containing 100% archaeols (Fig. 1A). A mem- 
brane unilayer is typical for lipids from T. aci- 
dophilum with 90% caldarchaeol (Fig. 1B), while a 
combination of bilayer and unilayer arrangements 
may be expected for the total polar lipids from ar- 
chaea such as Methanobrevibacter smithii (Fig. 1C). 


Figure 1. Archaeol and caldarchaeol polar lipid membrane mod- 
els. (A) Model lipids are archaetidylglycerol arranged to depict a 
bilayer archaeosome membrane. Regular methyl branches of the 
isopranoid chains are shown. (B) The membrane is depicted as a 
unilayer composed of the main polar lipid of T. acidophilum (64). 
(C) This model of an archaeosome membrane consists of a mixture 
of A and B polar lipids. 

Freeze-fracture results corroborate the mem- 
brane arrangements predicted for archaeosomes pre- 
pared from various total polar lipid sources (Fig. 2). 
The frequency of intramembrane fractures occurring 
along the hydrophobic plane of the membrane of ar- 
chaeal cell plasma membranes (and archaeosomes) 
correlates with the amount of caldarchaeols present 
(Fig. 2) (4). Intramembrane fractures were essentially 
absent at a content of caldarchaeols greater than 
50%. During freeze fracturing, intramembrane par- 
ticles were seen when fractures occasionally occurred 
along the hydrophobic membrane domain. However, 
this only occurred in cases where archaeosomes con- 
sisted of mixtures of archaeols and caldarchaeols. As 
these archaeosomes were protein free and the density 
of the intramembrane particles correlated with cal- 
darchaeol content, it was concluded that caldarchaeol 
complexes must form these (intramembrane) lipid 
raftlike particles (4). The physiological significance of 
these particles is unknown. 

Figure 2. Correlation between the caldarchaeol content of archaeal 
membranes and the frequency of intramembrane fractures ob- 
served by freeze-fracture electron microscopy. Similar data were 
obtained for freeze-fractured archaeosomes prepared from the to- 
tal polar lipids extracted from the same archaea. Modified from the 
Journal of Bacteriology (4) with permission of the publisher. 


Archaeosomes consisting largely of caldarchaeol 
lipids are unusually stable to thermally induced leak- 
age of solutes, and exhibit low ion permeabilities 
(7, 16, 49). The thermal stability during autoclaving 
of archaeosomes with different proportions of cal- 
darchaeol lipids was examined (Fig. 3). The method 
used was to first load archaeosomes with self-quench- 
ing amounts of carboxyfluorescein (CF). An increase 
in fluorescence occurred as CF leaked from the ar- 
chaeosomes during autoclaving and diluted into the 
suspending fluid. It was found that CF leakage de- 
clined as the caldarchaeol content increased, reaching 
a plateau at about 50%. This plateau at about 50% 
caldarchaeol content is similar to the caldarchaeol 
content that prevented intramembrane cleavage dur- 
ing freeze fracturing (Fig. 2). Thermal stability was 
higher for Methanococcus jannaschii archaeosomes 
than for M. smithii archaeosomes, despite similar cal- 
darchaeol content, but this is likely to reflect the ad- 
ditional stabilizing effect of macrocyclic archaeols 
from M. jannaschii. These data clearly indicate that 
caldarchaeol membrane-spanning lipids enhance 
thermostability. 
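
The carboxyfluorescein readout described above lends itself to a simple calculation. The sketch below, in Python, assumes the common dequenching convention in which leakage is the rise in fluorescence expressed as a fraction of the maximum possible rise (the signal after complete release of the dye, for example by detergent lysis). That normalization, the function names, and the example readings are assumptions made for illustration; the text states only that fluorescence increases as CF leaks and is diluted.

```python
def cf_leakage_percent(f_before, f_after, f_total):
    """Estimate percent CF leaked from fluorescence readings.

    f_before: fluorescence of intact, CF-loaded vesicles (self-quenched)
    f_after:  fluorescence after the stress (e.g., autoclaving)
    f_total:  fluorescence after complete release of CF (assumed control)
    """
    if f_total <= f_before:
        raise ValueError("f_total must exceed f_before")
    fraction = (f_after - f_before) / (f_total - f_before)
    # Clamp to the physically meaningful range to absorb measurement noise.
    return 100.0 * min(max(fraction, 0.0), 1.0)


def cf_retention_percent(f_before, f_after, f_total):
    """Percent CF retained, the quantity plotted against caldarchaeol content."""
    return 100.0 - cf_leakage_percent(f_before, f_after, f_total)


# Hypothetical readings in arbitrary fluorescence units.
print(cf_retention_percent(f_before=10.0, f_after=55.0, f_total=100.0))
```

With such a normalization, preparations with different caldarchaeol contents can be compared on the same 0 to 100% retention scale, as in Fig. 3.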

Permeability has been examined for several total 
polar lipid compositions of archaeosomes, compared 
with liposomes prepared from Escherichia coli lipids 
and from commercial 1,2-di-O-phytanyl-sn-glycero- 
3-phosphocholine (ester linkages to phytanyl chains) 
(Fig. 4) (35). These and other data (35) support the 
conclusion that the ether bonds of archaeal lipids 
(versus ester bonds), and particularly caldarchaeol 
lipids, decrease the proton permeability of archaeo- 
somes (Fig. 4A). Furthermore, limiting the mobility of 
hydrocarbon chains (caldarchaeols and macrocyclic 
archaeols) plays a role in decreasing permeability to 
glycerol (Fig. 4B) and other small molecules (35). 


Figure 3. Correlation between the caldarchaeol content of ar- 
chaeosomes and their thermal stabilities. Carboxyfluorescein (CF) 
was entrapped within archaeosomes made of total polar lipids 
from M. mazei (Mm), Methanosphaera stadtmanae (Mst), M. smithii 
(Ms), M. jannaschii (Mj), Methanospirillum hungatei (Mh), Meth- 
anobacterium espanolae (Me), and T. acidophilum (Ta). Retention 
of CF was measured following autoclaving for 15 min at 121°C. 
Modified from the Canadian Journal of Microbiology (12) with 
permission of the publisher. 


Figure 4. Proton and glycerol permeabilities of liposomes and 
archaeosomes. Permeability rates were measured in lipid vesicles 
prepared from the ester lipid diphytanylphosphatidylcholine (Dph- 
PC), total polar lipids of E. coli (E. coli), H. salinarum (As), 
M. smithii (As + Cs), M. jannaschii (Am + Cs + As), and T. aci- 
dophilum (Cp + As). Symbols indicate archaeols (As), caldarchae- 
ols (Cs), macrocyclic archaeols (Am), and caldarchaeols with cy- 
clopentane rings (Cp). Reproduced from the Journal of Biological 
Chemistry (35) with permission of the publisher. 

The extreme environmental conditions in which 
an archaeon thrives might be expected to correlate 
with the stability of its archaeosomes. However, 
this is not always the case. For example, archaeo- 
somes prepared from the polar lipids of alkaliphilic 
Natronobacterium magadii were sensitive to alkaline 
pH (11). As described above (Fig. 3), thermostability 
of archaeosomes correlates with caldarchaeol con- 
tent. However, some hyperthermophiles, such as 
Methanopyrus kandleri, lack, or have a low content 
of, caldarchaeol lipids (53), indicating from a physi- 
ological perspective that the presence of a high 
amount of caldarchaeols is not necessarily a prereq- 
uisite to extreme thermophily. In addition, the total 
polar lipids from extremely thermophilic archaea do 
not appear to be structurally very different from those 
from other archaea (53). In contrast to these examples, 
the lipids from extremely halophilic archaea form ar- 
chaeosomes in salt concentrations up to 4 M, whereas 
liposomes from E. coli (67) or archaeosomes from to- 
tal polar lipids of nonhalophilic archaea do not form in 
high-salt concentrations. The mechanism of salt sta- 
bility for these archaeosomes is correlated with the 
content of the diacidic lipid, archaetidylglycerol phos- 
phate-O-methyl (see Chapter 15), a lipid that is unique 
to and abundant in extremely halophilic archaea (65). 


INTERACTION WITH ANTIGEN- 
PRESENTING CELLS 


Macrophages, and especially dendritic cells (DCs), 
are the professional antigen-presenting cells (APCs) 
that initiate the immune response and are the critical 
cell types for adjuvant action. It has been known since 
the 1970s that liposomes are taken up by phagocytosis 
and adjuvant the immune response by antigen delivery 
(1). The concept of using archaeal lipids as adjuvants 
to deliver antigens with higher efficiency than conven- 
tional liposomes stemmed from the stability properties 
of archaeosomes. However, lower efficiency was also 
considered to be possible if the archaeosomes proved 
to be too stable to release their antigen cargo. 

Much of the recent activity in adjuvant design 
has been centered on discovery of molecules associ- 
ated with pathogens (pathogen-associated molecular 
patterns) that are agonists for Toll-like receptors 
found on APCs (51). Archaea are not known to be as- 
sociated with humans except in anaerobic niches such 
as the intestinal tract, where M. smithii and Methano- 
sphaera stadtmanae generate methane as part of the 
normal microbiota (36). Because archaea have not 
been linked to pathogenic activity and lack endotox- 
ins (6, 22), interaction of archaeal molecules with 
mammalian receptors found on APCs, such as the 
Toll-like danger receptors designed to recognize and 
prompt the immune response to the presence of 
pathogens, would not be expected. Therefore it was 
a serendipitous discovery that archaeal lipids proved 
to be good adjuvants capable of activating the im- 
mune system of mammals. 


Endocytosis 


In macrophage cultures, the uptake of archaeo- 
somes derived from several lipid compositions is 3 to 
53 times higher than the uptake of conventional lipo- 
somes (66). Prior to this study, uptake of archaeo- 
somes by mammalian cells had not been assessed. The 
results were unexpected and highlighted the poten- 
tial utility of archaeosomes as antigen carriers. Non- 
phagocytic cell cultures took up comparatively little of 
the archaeosomes, indicating that uptake was through 
a phagocytic mechanism. Cytochalasins, inhibitors of 
phagocytosis, were used to confirm that uptake was 
by phagocytosis and to show that following internal- 
ization the structural integrity of the archaeosomes 
was lost over time (24, 66). This observation was im- 
portant as it indicated that the archaeal delivery sys- 
tem was not too stable to release the antigen for pro- 
cessing and presentation. Incubation of fluorescent 
archaeosomes with macrophages and DCs resulted in 
the rapid accumulation of hot spots of archaeosomes 
(Fig. 5), indicating internalization of many archaeo- 
somes within each phagosome. 

Figure 5. (See the separate color insert for the color version of this 
illustration.) Uptake of fluorescent archaeosomes by phagocytic 
cells. Archaeosomes composed of total polar lipids were prepared 
either by incorporating a small amount of the fluorescent lipid rho- 
damine-phosphatidylethanolamine (62) or by entrapping 1.5 mM 
carboxyfluorescein (66). Uptake was performed in 1 ml of RPMI 
medium added to 0.5 million adhered cells (62). Panels show 
30-min uptakes: (A) M. smithii rhodamine-archaeosomes (100 µg) 
by thioglycollate-activated mouse peritoneal macrophages; (B) up- 
take of M. smithii rhodamine-archaeosomes (25 µg) by bone 
marrow-derived DCs; (C) uptake of M. mazei carboxyfluorescein- 
archaeosomes (40 µg) by macrophage culture J774A.1; and 
(D) uptake of H. salinarum rhodamine-archaeosomes (100 µg) by 
thioglycollate-activated mouse peritoneal macrophages. 


Archaeosomes prepared from the total polar 
lipids of M. smithii have proven to be especially strong 
adjuvants. Recently, several reasons for this activity 
have been elucidated. First, the polar lipids are ap- 
proximately 60% archaeols and 40% caldarchaeols, 
thereby imparting enhanced stability. Second, the total 
polar lipids are exceptionally high in lipids with phos- 
phoserine headgroups (Fig. 6). Evidence for endocyto- 
sis (receptor-mediated phagocytosis) of M. smithii ar- 
chaeosomes via a phosphatidylserine (PS) receptor 
present on APCs is described below (see “Presentation 
pathway,” below). Third, the total polar lipids of M. 
smithii form archaeosomes readily, and good yields 
are recovered following filter sterilization. Studies ex- 
amining the mechanism of action of archaeosome ad- 
juvants have tended to use archaeosomes prepared 
from the total polar lipids of M. smithii. 


Costimulation 


Two signals are required to activate T cells. The 
first is antigen presentation in the context of MHC, 
and the second is costimulation of the specific T cell 
recognizing the presented antigen. Consequently, for 
an adjuvant to effectively augment an immune re- 
sponse, not only must the antigen be delivered appro- 
priately to APCs, but it is also imperative that costim- 
ulation occurs. Dendritic cells mature and upregulate 
expression of key costimulatory molecules on their 
surface, thereby providing the important second sig- 
nal for T-cell activation. Activated mature DCs also 
secrete specific cytokines that orchestrate the immune 
response and direct the CD4⁺ T cells toward a Th1- 
or Th2-type response (37). 

M. smithii archaeosomes provide a strong cos- 
timulation signal to macrophages (Fig. 7) and DCs 
(30). Conventional liposome composition (PC/PG/ 
cholesterol) provides little costimulation (Fig. 7), con- 
sistent with a need to incorporate a coadjuvant to im- 
prove its vaccine potential (48). The striking upregu- 
lation of MHC class II, CD80, and CD86 observed 
with M. smithii archaeosomes compares favorably 
with that obtained with potent lipopolysaccharide 
from E. coli. A similar striking upregulation of expres- 
sion of costimulatory marker CD40 occurs in mouse 
bone marrow-derived DCs exposed to M. smithii ar- 
chaeosomes (24). Furthermore, DCs were activated by 
M. smithii archaeosomes to secrete moderate amounts 
of cytokines interleukin 12 (IL-12) and tumor necro- 
sis factor (TNF) (24, 30). Collectively, these studies 
highlight the value of archaeosomes when compared 
with nonarchaeal liposome preparations. 

In contrast to archaeosomes from M. smithii, 
there is relatively little known about costimulatory ef- 
fects of archaeosomes prepared from the total polar 
lipids of other archaea. However, it is known that 
costimulatory molecules are also upregulated in 
macrophage and DC cultures exposed to H. sali- 
narum or T. acidophilum archaeosomes (L. Krishnan 
and G. D. Sprott, unpublished data). Furthermore, 
many archaeosome types evoke moderate amounts 
of IL-12 secretion by APCs (62). As the headgroups 
greatly vary in these different archaea, this suggests 
that the core lipid structure that is common to ar- 
chaea may be important for the interaction of ar- 
chaeosomes with APCs. 


Figure 6. Structures and abundance of the polar lipids found in the total polar lipids of M. smithii (54). Phosphoserine head- 
groups are abundant in both archaeol (2,3-di-O-sn-phytanylglycerol) and its dimer, caldarchaeol, lipids. Other lipids present in 
minor amounts are not shown. 


Presentation Pathway 


Antigen presentation may be assessed by an in 
vitro assay designed to monitor the extent to which 
an ovalbumin-derived MHC class I epitope (SIINFEKL) 
is presented on the surface of an APC. For this to 
occur by the classical pathway, the ovalbumin en- 
trapped within archaeosomes must undergo endo- 
cytosis by the DC or macrophage culture and be 
delivered to the cytosol for processing in the protea- 
some. In brief, the major MHC class I peptide 
SIINFEKL, so generated, may then be loaded on MHC 
class I molecules and the complex delivered to the cell 
surface for presentation to CD8⁺ T cells. Thus the in 
vitro assay is based on incubating APCs with antigen 
entrapped in archaeosomes, allowing intracellular 
processing, and then incubating them with CD8⁺ 
T cells that have the specific T-cell receptor to bind 
the presented MHC class I-SIINFEKL epitope. Bind- 
ing activates the CD8⁺ T cells to secrete the cytokine 
interferon-γ (IFN-γ). Quantification of IFN-γ pro- 
duction by the T cells is a direct indication of the abil- 
ity of archaeosomes to deliver ovalbumin to the cy- 
tosol for MHC class I loading (Fig. 8). Synthetic 
SIINFEKL added to a separate aliquot of the APC 
culture will bind to MHC class I molecules exoge- 
nously and is equated to 100% loading. In this assay, 
soluble ovalbumin fails to activate T cells, as it cannot 
gain access to the MHC class I pathway, and serves as 
a negative control. 


Figure 7. Upregulation of cell surface molecules on J774A.1 macrophages treated with archaeosomes. Macrophages were 
treated for 24 h with no activator, 25 µg of antigen-free liposomes ml⁻¹ (PC/PG/cholesterol), 25 µg of antigen-free archaeo- 
somes ml⁻¹, or 10 µg of lipopolysaccharide ml⁻¹. Cells were then double stained with Mac1α-PE (macrophage marker) and 
one of the cell surface markers shown. Data were acquired by flow cytometry. Reproduced from the Journal of Immunology 
(30) with permission of the publisher. 
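
The normalization used in the presentation assay described above can be sketched as a short calculation. In the Python example below, the synthetic-peptide control defines 100% loading, as stated in the text; subtracting the soluble-ovalbumin signal as background is an assumption made here for illustration, as are the function name and the cytokine values.

```python
def percent_class_i_loading(ifn_sample, ifn_peptide, ifn_background=0.0):
    """Express IFN-gamma output relative to the SIINFEKL peptide control.

    ifn_sample:     IFN-gamma from T cells exposed to APCs given
                    ovalbumin-archaeosomes
    ifn_peptide:    IFN-gamma with synthetic SIINFEKL (defined as 100%)
    ifn_background: IFN-gamma with soluble ovalbumin (negative control);
                    subtracting it is an assumption of this sketch
    """
    span = ifn_peptide - ifn_background
    if span <= 0:
        raise ValueError("peptide control must exceed the negative control")
    return 100.0 * (ifn_sample - ifn_background) / span


# Hypothetical cytokine readings (e.g., pg per ml).
print(percent_class_i_loading(ifn_sample=600.0, ifn_peptide=1100.0,
                              ifn_background=100.0))
```

Expressing results this way allows different archaeosome compositions, or the same composition in the presence of inhibitors, to be compared on a common scale.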

Figure 8. Scheme depicting processing of ovalbumin encapsulated 
in archaeosomes of M. smithii by an APC. Sites of inhibitor action 
are shown. Specific peptide fragments of the Ag (ovalbumin) are 
presented by MHC class I or MHC class II to CD8⁺ T cells and 
CD4⁺ T cells, respectively. A CD8⁺ T cell specific to the MHC 
class I ovalbumin peptide complex is shown docking with the APC. 
Docking results in activation and excretion of IL-2 and IFN-γ. SR, 
signaling receptor; PR, phagocytosis receptor; R1, R2, and R3 are 
steps 1 to 3, the rates of which depend on the archaeosome compo- 
sition and APC. 


The pathway of antigen presentation in DCs and 
macrophages has been assessed in some detail for 
ovalbumin entrapped in M. smithii archaeosomes 
and found to occur via a series of classical steps for 
both MHC class I and class II processing (Fig. 8). For 
M. smithii archaeosomes, the high content of arch- 
aetidylserine (phosphoserine-archaeol) and phospho- 
serine-caldarchaeol lipids indicates that the mech- 
anism of endocytosis may be promoted by a PS 
receptor. PS receptors are commonly found in DCs 
and macrophages. Their role in clearing cells under- 
going apoptosis has been studied in macrophages 
(25), but not their role in influencing antigen presen- 
tation. Several lines of evidence support a PS receptor- 
mediated endocytic mechanism. First, Annexin V la- 
beling experiments indicate that PS headgroups in 
M. smithii archaeosomes are surface exposed and 
available for receptor interaction (62). Second, cyto- 
chalasin inhibitors of phagocytosis prevent uptake of 
these archaeosomes. Third, phosphatidylserine lipo- 
somes compete for uptake of M. smithii archaeo- 
somes, whereas liposomes or archaeosomes lacking 
the phosphoserine headgroup are less effective (24, 
62). This is illustrated by the results of the assay just 
described that was designed to measure antigen pro- 
cessing by DCs incubated with ovalbumin-archaeo- 
somes (Fig. 9). Inhibition of antigen processing by 
either soluble phosphoserine or phosphatidylserine li- 
posomes indicates competition for the endocytosis of 
M. smithii archaeosomes. Specificity for the PS recep- 
tor was indicated by lack of inhibition upon block- 
ing either the mannose receptor by mannopentaose, 
or the Fc receptors by specific antibody (24). As the 
PS receptor is expressed primarily on APCs, engag- 
ing this receptor for antigen delivery may result in 
efficient antigen delivery for processing by the im- 
mune system. 

Figure 9. PS receptor-mediated endocytosis of M. smithii archaeo- 
somes containing ovalbumin. Processing of ovalbumin and pre- 
sentation of peptide by MHC class I is quantified by assay of IL-2 
or IFN-γ by activated CD8⁺ T cells, as described in the legend to 
Fig. 8. Competition for uptake of archaeosomes was assayed for 
soluble phospho-L-serine (Sol. PS), phosphatidylserine liposomes 
(PS lipo), or mannopentaose (MP), at the µg concentrations shown 
in parentheses. Reproduced from the Journal of Immunology (24) 
with permission of the publisher. 


Inhibition of MHC class I processing by chloro- 
quine and monensin (inhibitors of acidification of the 
phagolysosome) indicates that acidification is a nec- 
essary event to promote the cytosolic translocation 
of the antigen from the phagolysosome (24). Inhibi- 
tion of proteases in the phagolysosome by leupeptin 
and antipain did not interfere with MHC class I pro- 
cessing, whereas inhibition of cytosolic proteasome 
processing by inhibitors such as lactacystin blocked 
MHC class I processing (Fig. 8). The implication of 
this finding is that the protein antigen must be 
processed for MHC class I presentation, subsequent 
to translocation to the cytosol. Furthermore, DCs 
from TAP (transporter associated with antigen pro- 
cessing) deficient mice did not process antigen carried 
in M. smithii archaeosomes, for MHC class I presen- 
tation (24). All of these findings support a classical 
MHC class I pathway for antigen entrapped in M. 
smithii archaeosomes. In addition, part of the antigen 
delivered was directed to MHC class II processing 
(24), indicating that MHC class I and II compete for 
antigen. Clearly, the properties of the specific archaeo- 
some type used in a vaccine may influence this compe- 
tition to favor either one or other pathway, and may be 
used to bias processing by either MHC class I (CD8⁺ 
T-cell response) or class II (CD4⁺ T-cell response). 


IMMUNE RESPONSES 


CD4⁺ T-Cell Response 


Antigen presentation in the context of MHC 
class II results in interaction with CD4⁺ T cells, which, 
depending on the cytokine environment, differenti- 
ate primarily toward either Th1 cells secreting type 1 
cytokines (cell-mediated response) or Th2 cells se- 
creting type 2 cytokines (antibody response). Initial 
studies revealed that a strong antiprotein antibody re- 
sponse developed in mice that had been immunized 
with various archaeosome types that contained a pro- 
tein antigen (63). This response was higher than that 
obtained using conventional liposomes. In fact, the 
adjuvant activity decreased as the proportion of con- 
ventional lipids to archaeosomes was increased (28). 
This may have been caused by the lower stability and 
leakage of entrapped antigen in vivo, as adjuvant ac- 
tivity demands that antigen be retained within the 
archaeosome/liposome for delivery, or loss of the co- 
stimulation signal. 

Further studies revealed that the archaeal lipid 
composition used to entrap the antigen has a marked 
effect on the antibody titers measured in the sera of 
immunized mice (Fig. 10). Highest mean titers (6 to 
8 mice/group) occurred with archaeosomes prepared 
from the total polar lipids of Halobacterium halo- 
bium and were significantly higher than all other ar- 
chaeosome types at day 10 (P < 0.05). Following a 
second injection, the antibody mean titer for the H. 
halobium archaeosome adjuvant was not significantly 
different from those obtained for Halococcus morr- 
huae or Methanosarcina mazei archaeosomes by day 
49, but was significantly higher (P < 0.05) compared 
with either M. smithii or M. jannaschii. Significant 
differences were seen at 49 days between higher titers 
for M. mazei than for M. smithii, and for M. smithii 
than for M. jannaschii. It is tempting to link the some- 
what lower titers obtained for M. smithii, and espe- 
cially the low titers for M. jannaschii, with the higher 
archaeosome stability. Caldarchaeol lipids account 
for about 40 mol% of the polar lipids in M. smithii 
(54) and, in combination with macrocyclic archae- 
ols, about 85 mol% in M. jannaschii (when cells are 
grown at 65°C) (58). A higher stability and lower 
antigen release into the phagolysosome (Step R2 in 
Fig. 8) may account for the reduced antigenicity of 
archaeosomes from these archaea (Fig. 10). 


Figure 10. Influence of the polar lipid composition of archaeosomes on humoral adjuvant activity. Archaeosomes with en- 
trapped ovalbumin were prepared from the total polar lipids extracted from H. halobium (H.h), Halococcus morrhuae strains 
14039 and 16008, Methanosarcina mazei (M.m), M. smithii (M.s), and Methanococcus jannaschii (M.j). Injections, given 
subcutaneously at zero and 3 weeks, contained 15 µg ovalbumin. Titers of antiovalbumin antibody in sera are given for 10-day 
and 49-day bleeds. Titers in sera from mice immunized with 15 µg ovalbumin (no adjuvant) were below 10. Each data point 
is for a different mouse. Reproduced from Archaea (62) with permission of the publisher. 


Structural differences in the total polar lipid 
headgroups present in the various archaeosome types 
may influence the magnitude of MHC class II re- 
sponses. Headgroup structural features may be espe- 
cially important in receptor-mediated endocytosis and 
signaling mechanisms. Several lines of evidence illus- 
trate the importance of archaeal lipid headgroups to 
humoral adjuvant activity. Lipids from the extremely 
halophilic archaea (H. halobium and H. morrhuae 
strains 14039 and 16008) differ from the other ar- 
chaeosome types by having a high content of ar- 
chaetidylglycerol phosphate (PGP-CH3) and sulfated 
glycolipids (62). N-Acetylgalactosamine-4-SO4 is 
recognized by the mannose receptor of phagocytic 
cells (17), and sulfation of archaeal glycolipids may 
promote interaction with this or other receptors. An- 
tibody titers were high for M. mazei archaeosome 
adjuvant (Fig. 10), and these archaeosomes are com- 
posed almost exclusively of phospholipids. This sug- 
gests that in the presence of archaetidylserine or other 
bioactive phospholipids, archaeal glycolipids are not 
essential for an antibody response. M. jannaschii lipid 
extracts are high in glycolipids (the dominant sugar 
is N-acetylglucosamine) (62), and this archaeosome 
type produced a relatively weak antibody response, 
indicating that relatively low adjuvant activity is in- 
duced by this headgroup. Phagocytosis is promoted 
when conventional liposomes contain phosphatidyl- 
serine (34). Therefore, the presence of the phospho- 
serine archaeal lipids found in M. smithii, M. mazei, 
and M. jannaschii might account for some of the ad- 
juvanticity observed for these archaeosomes. 

Surprisingly, M. smithii archaeosomes induced 
both Th1 (IFN-γ) and Th2 (IL-4) cytokines in spleen 
cells from immunized mice, in contrast to conven- 
tional liposomes and alum, which induced only IL-4 
(28). The finding that this archaeosome was capable 
of a mixed adjuvant activity may be highly beneficial 
for the development of new vaccines. 


CD8+ T-Cell Response 


A CD8+ T-cell response is an adaptive immune 
response largely evolved for protection against viral or 
other intracellular infections. As exogenous antigens 
generally do not gain access to the cytosol of APCs for 
processing, they do not promote a CD8+ T-cell re- 
sponse. However, a cytotoxic T-cell (CTL) response is 
required for vaccines to be protective against many 
diseases, to clear host cells infected by intracellular 
pathogens or in a cancerous state. Several adjuvant 
systems are being developed to try to alleviate this 
problem and to deliver antigen for MHC class I pro- 
cessing (5, 38). Archaeosomes may serve well for this 
purpose as they are potent CTL adjuvants for en- 
trapped protein and peptide antigens (29). 

M. smithii archaeosomes deliver protein antigens 
from their location in phagolysosomes to the cytoso- 
lic, classical processing pathway, and load MHC class 
I molecules with antigen-derived peptides. This pro- 
cess results in development and proliferation of cyto- 
toxic T cells capable of killing target cells that express 
these same peptide epitopes on their surface. Various 
types of archaeosomes (containing ovalbumin as the 
test antigen) have been compared as CTL adjuvants 
in mice. Although all the archaeosome types demon- 
strate potential to adjuvant a CTL response, a single 
subcutaneous injection produced a primary (10-day) 
CTL response that was strongest for H. halobium, 
H. morrhuae 14039, and M. smithii archaeosomes 
(Fig. 11). The polar lipids from H. morrhuae 14039 
are remarkably similar to strain 16008, except for the 
presence of an unidentified glycolipid in strain 14039 
(62). Archaeosomes from strain 14039 were superior 
to 16008 in promoting a CTL response and in acti- 
vating DCs to secrete IL-12, implicating the unknown 
glycolipid as an immune activator (62). 


Immune Memory 


To be effective, vaccine adjuvants must not only 
elicit strong primary immune responses to the pro- 
tective antigens but also achieve long-lasting immu- 
nity with strong memory recall. Most published stud- 
ies focus only on the short term. The importance of 
long-term studies was demonstrated recently using a 
liposome vaccine prepared from the total polar lipids 
of Marinococcus (Planococcus H8), a member of the 
Bacteria (56). This liposome type initially appeared to 
have promise as it activated DCs and produced a 
good primary CTL response, but unfortunately the 
response was only short lived. A liposome vaccine 
prepared from the total polar lipids of Bacillus firmus 
was a weak adjuvant for both antibody and CTL re- 
sponses (56), further highlighting the value of ar- 
chaeal lipids for vaccine applications. 

Archaeosomes containing bovine serum albumin 
as test antigen produced long-term antibody titers in 
mice (28). Two total polar lipid archaeosome types 
were tested. H. salinarum archaeosomes resulted in a 
strong primary response, long-lasting titers over the 
life of the mice, and memory recall. Such a long-last- 
ing response was not expected as these archaeosomes 
lack caldarchaeol lipids and were anticipated to con- 
tribute less to antigen persistence than caldarchaeol-con- 
taining adjuvants. For T. acidophilum archaeosomes, 
the primary response was not as strong as for the H. 
salinarum archaeosomes, but titers were maintained, 
and a strong memory recall response occurred. Alum 
is the antibody-adjuvant approved for human use, 
and it produced a response comparable to H. sali- 
narum archaeosomes, although memory recall was 
not as strong (28). 

Figure 11. Influence of the polar lipid composition on cytotoxic T-cell responses (CTL) to ovalbumin entrapped in archaeo- 
somes. C57BL/6 mice were immunized subcutaneously with 15 µg ovalbumin entrapped in various total polar lipid archaeo- 
somes. Ten days later, CTL activity was measured in splenic cell cultures. Killing of specific target cells (open symbols) by ef- 
fector cells and lack of killing of nonspecific targets (closed symbols) are shown. The ratio of effector to target cells is shown 
as the E:T ratio. Reproduced from Archaea (62) with permission of the publisher. 


When using lipid adjuvants, long-term CD8+ 
T-cell responses may be more difficult to achieve than 
B-cell (antibody) memory responses (56). The longe- 
vity of CTL responses in mice immunized with archaeo- 
somes prepared from several archaea was examined 
over the long term (up to 50 to 65 weeks following 
vaccination) (32). In contrast to formulations of li- 
posomes, a long-term memory response was observed 
for M. smithii and T. acidophilum archaeosomes. CTL 
memory was less evident for the other archaeosome 
preparations, which all lack membrane-spanning 
lipids, and it is tempting to speculate that the caldar- 
chaeols promote these unusually long-lasting CTL re- 
sponses. However, in an apparent contradiction, cal- 
darchaeols and macrocyclic archaeols are present in 
M. jannaschii archaeosomes, and these did not con- 
tribute to long-term CTL responses. The relatively 
weak priming of immune responses by M. jannaschii 
archaeosomes may have contributed to the poorer 
long-term responses. Furthermore, the reason for an 
apparent absence of long-term CTL responses in mice 
immunized with some archaeosomes may relate to 
variable factors such as antigen loading, antigen dose, 
or vaccination regimen. 


VACCINES 


Data using test antigens indicate the strong po- 
tential for archaeosome vaccines to achieve immunity 
against intracellular bacteria. Because Listeria mono- 
cytogenes is an intracellular pathogen that is typically 
cleared by host CD8+ T-cell immunity (42), this 
model has been used as a test for potency of archaeo- 
some vaccines. The protective CTL epitope of the key 
immunogenic protein of Listeria, Listeriolysin, is 
known. This peptide was synthesized and coupled to 
palmitoyl chains to promote high-efficiency encapsu- 
lation (14). Mice immunized with the M. smithii vac- 
cine developed a strong CTL response that resulted in 
specific immunity following infection of the mice with 
L. monocytogenes. Protection did not occur with 
mice immunized with antigen only, or with archaeo- 
somes lacking antigen. Furthermore, prolonged pro- 
tection was demonstrated (Fig. 12). Similar protection 
was observed with T. acidophilum archaeosomes, 
somewhat less protection with H. salinarum archaeo- 
somes, and the least protection with conventional 
liposomes (14). 

Figure 12. Protection of archaeosome-immunized mice against in- 
fection by the facultative intracellular bacterium, Listeria mono- 
cytogenes. BALB/c mice were vaccinated subcutaneously on days 0, 
21, and 42 with 12.5 µg of a dipalmitoylated peptide entrapped 
in M. smithii archaeosomes. After 3, 5, and 10 months, mice were 
challenged with live L. monocytogenes, and the bacteria in spleens 
were enumerated 3 days later. C, mice not immunized; V, vacci- 
nated mice. The number of spleens in each group of 5 mice that 
were still infected is shown as (x/5). Modified from Vaccine (14) 
with permission of the publisher, with the exception of the 10- 
month data (J. W. Conlan, unpublished data). 

Archaeosomes also have potency and utility as 
adjuvants for cancer vaccines (31, 32), where a CD8+ 
T-cell response is required for protection against or regression 
of tumors. Solid tumor and metastasis mouse models 
have been used to evaluate both the therapeutic and 
prophylactic utility of archaeosomes. Using ovalbu- 
min as the test antigen, effective protection of mice 
vaccinated with M. smithii archaeosomes was demon- 
strated in both solid tumor and metastasis models. 
Protection was dependent on the presence of CD8+ 
T cells and required IFN-γ (31). Protection also oc- 
curred in IL-12 knockout mice (31). These data indi- 
cated strong adjuvant activity of the M. smithii ar- 
chaeosomes (31). Furthermore, in therapeutic tumor 
models, solid tumors regressed following the injection 
of M. smithii archaeosomes loaded with antigen. Some 
regression also occurred with injection of antigen- 
free M. smithii archaeosomes, but not with liposomes 
made from phosphatidylcholine, phosphatidylglyc- 
erol, and cholesterol (31). T. acidophilum antigen-free 
archaeosomes also had tumor-regressing properties, 
but H. salinarum archaeosomes did not (32). This in- 
triguing effect of the M. smithii archaeosomes might 
relate to the high content of archaetidylserine. This 
is consistent with studies showing that liposomes con- 
taining phosphatidylserine damage the vascular sys- 
tem of tumors (9). However, this mechanism of tumor 
regression would not explain how the T. acidophilum 
archaeosomes exert their effect, as archaetidylserine 
is absent. 


SAFETY 


Archaea are nonpathogenic organisms (6, 22), 
not associated with endotoxin or other toxic metabo- 
lites, and it may therefore be anticipated that ar- 
chaeosomes would be safe. The main polar lipid of 
T. acidophilum (64) has been administered to mice by 
intraperitoneal and oral routes and found to be free 
of toxicity (18). Furthermore, a series of total polar 
lipid archaeosome compositions were tested to as- 
sess the possibility of toxicity related to their use in 
mammals (43, 46). In mice, significant levels of anti- 
archaeosome antibodies were not detected, and reac- 
tions or toxicity were absent in a standard series of 
toxicity tests (46). A similar conclusion was reached 
in studies using mice that were administered a high 
intravenous or oral dose of archaeosomes (43). In tri- 
als using hundreds of mice of various strains, ages, 
and sexes, effects of toxicity have never been docu- 
mented, suggesting that archaeosomes will prove to 
be a safe adjuvant suitable for trials in humans. 


PERSPECTIVE: THE NEXT FIVE YEARS 


The utility of archaeal polar lipids as adjuvants 
for mammalian vaccines is now well established. The 
stable physical properties of the polar lipids, the hy- 
dration properties of the lipids that result in archaeo- 
some formation, the stability of archaeosomes, and 
the specific interactions that occur with APCs all con- 
tribute to their adjuvant potential. Patents have been 
issued in various countries, including PCT applications, 
that range from covering the preparation of archaeosomes 
from total polar lipids of archaea to describing their adjuvant 
properties (59, 61). The intellectual property on adju- 
vant activity includes archaeosomes prepared from pu- 
rified or synthetic archaeal lipids, and archaeal lipids 
mixed with conventional lipids. Although animal trials 
indicate that archaeosomes are safe with no toxicity 
issues, human clinical trials have yet to commence. 

Archaeosomes act as mixed adjuvants augment- 
ing Th1 and Th2 arms of immunity, as well as the 
CTL response. Delivery of antigen for presentation by 
MHC class I and MHC class II are competing path- 
ways that are favored depending on the lipid compo- 
sition of archaeosomes. For example, the total polar 
lipids from H. salinarum favor MHC class II presen- 
tation and antibody response, whereas the polar 
lipids from M. smithii and T. acidophilum favor 
MHC class I presentation and the CTL response. Fur- 
ther studies directed toward understanding archaeo- 
some structure in relation to antigen presentation 
may allow design of an archaeosome type to meet the 
needs of a particular vaccine. 

Archaeosomes are proving to be a valuable tool 
to study the interaction of lipid adjuvants with the 
mammalian immune system. Over the next several 
years we can anticipate that researchers will address 
many of the questions required to explain fully the 
mechanism of adjuvant action. To date, most of the 
published mechanistic work has used archaeosomes 
prepared from the total polar lipids of M. smithii, 
which are notable for their high content of phospho- 
serine lipids. It has not yet been determined whether 
other archaeosome compositions also interact with 
APCs to use the classical pathways of antigen presen- 
tation. While it is known that various archaeosome 
lipid compositions deliver protein antigen to the 
cytosol of APCs for MHC class I presentation, the 
mechanism of endosome to cytosol antigen transloca- 
tion is unknown. This may occur by fusion of ar- 
chaeosomes with the phagolysosome membrane trig- 
gered by the low pH and/or higher calcium content 
found in these compartments (13), or destabilization 
and leakage to the cytosol. With the exception of one 
study showing that nonphysiological amounts of cal- 
cium phosphate can promote slow fusion, little is 
known about membrane fusion events of archaeo- 
somes, especially those high in caldarchaeol lipids (15). 

M. smithii archaeosomes interact with the PS re- 
ceptor on APCs and adjuvant the immune response 
(24). However, this observation is difficult to rational- 
ize in view of the down-regulation of the immune re- 
sponse that occurs in DCs exposed to liposomes 
containing phosphatidylserine (8). M. smithii archaeo- 
somes elicit a counterintuitive effect that includes ca- 
pacity to stimulate allogeneic T-cell proliferation, to 
activate CD4+ T cells, and to upregulate costimula- 
tory molecules. The stereochemistry of archaeal lipids, 
or some other unique archaeal lipid feature, may ex- 
plain this apparent contradiction. 

Adjuvant development has centered on discovery 
of Toll-like receptor agonists to induce an inflamma- 
tory response by alerting the immune system to the 
presence of “danger signals” normally expressed by 
pathogens. Most archaeosomes are only mildly in- 
flammatory (24, 29, 30, 57, 62), indicating that 
strong adjuvant activity need not necessitate the side 
reactions associated with strong inflammation. In- 
deed it has been suggested that overt inflammation 
may erode memory cells (56). Much remains un- 
known regarding the APC receptors and signaling 
pathways that may be activated by various archaeo- 
some lipid compositions, or the relationship between 
adjuvant activity and inflammation. 

M. smithii and T. acidophilum archaeosome lipid 
compositions have the unusual capacity to induce a 
long-lasting CD8+ T-cell memory response in animals 
(29, 32). At present the mechanism is largely un- 
known but may relate to the induction of specific T- 
cell subsets during a strong primary response and/or 
to archaeosome-antigen persistence. The membrane- 
spanning caldarchaeol lipids are suspected to be re- 
quired for this phenomenon. Furthermore, it is cur- 
rently unknown whether the apparent inability of 
some archaeosome compositions to maintain a very 
long-lasting CD8+ T-cell response in animals may be 
compensated for by varying parameters such as the 
antigen dose or injection schedule. 


Antigen must be physically associated with the 
archaeosome for adjuvant activity to occur (28). 
However, studies have not compared antigens that are 
internalized with those coupled to the surface of ar- 
chaeosomes. Surface coupling may prove to be at- 
tractive for peptides in cancer vaccine applications 
(50), and more work is required in the area of anti- 
gen-archaeosome formulation. 

To achieve the ultimate goal of an “optimized ar- 
chaeosome” tailored to elicit the immune responses 
required for a particular vaccine, it may be necessary 
to use purified or synthetic archaeal lipids. This is in- 
dicated from observations that specific archaeal, and 
nonarchaeal, lipid structures have particular receptor 
recognition and APC-activating properties, whereas 
others do not (24, 56, 62). However, utility in animals 
of total polar lipid archaeosome vaccines has been 
demonstrated, and this approach is likely to have ad- 
vantages of lower cost and multiple receptor interac- 
tions and to contribute to the formation of stable vac- 
cines. Success of archaeosomes as a human adjuvant 
now depends on successful clinical testing of an ar- 
chaeosome vaccine. 

Figure 7 (Chapter 2). The euryarchaeon SM1 and its extracellular appendages (“hami”). (A) Electron micrograph of a “hamus.” (B) Enlargement of the 
hook region. (C) Simplified model of a hamus with the three filaments shown in different colors and 3D reconstruction from cryoelectron microscopy. (D) 
“String of pearls,” archaeal/bacterial community in cold, sulfurous spring water. (E) Hamus model with dimensions. (F) Natural biofilm hybridized with an 
SM1-specific fluorescent probe; circle diameter, 4 µm. (G) Pt-shadowed electron micrograph of a single SM1 cell with appendages. Figure compiled, modi- 
fied, and reproduced from Biospektrum (264) and Molecular Microbiology (265) with permission of the publishers. 

Figure 8 (Chapter 2). Three-dimensional structures of archaeal Argonaute proteins. (A) 3D structure of P. furiosus Ago with 
the PAZ domain (blue) and the PIWI domain (green/yellow) (PDB code 1U04). (B, C) Similarity of the P. furiosus PIWI do- 
main (B) with the catalytic core of the E. coli RNase H1 (C) (PDB code 1RDD) with the catalytic DDE triad and bound Mg2+ 
ion highlighted. The P. furiosus PIWI domain has a putative, similar catalytic DDE triad and a conserved Arg (position 627). 
(D, E) 3D structure of the P. furiosus PAZ domain (D) and comparison with the homologous domain of human Ago1 bound 
to an siRNA mimic (E) (PDB code 1SI3). (F) Domain structures of Ago proteins, including N-terminal, linker (L1 and L2), PAZ, 
Mid, and PIWI domains and of human dicer comprising a DEXH helicase, a PAZ, two RNase III, and dsRBD domains and a 
conserved domain of unknown function (DUF). Panels A to E reproduced with modifications from Current Opinion in Struc- 
tural Biology (241) with permission of the publisher. 

Figure 10 (Chapter 2). Three-dimensional structure of the 50S subunit of the Haloarcula marismortui ribosome. The ribo- 
some arm around ribosomal protein L1 was omitted (for a more complete picture see reference 200). Figure drawn from the 
coordinates from PDB entry 1QVF (200); ribosomal RNAs are displayed in red (backbone) and gray (bases), proteins are dis- 
played as yellow backbone ribbons. Top left, crown view; top right, back view; bottom, bottom view; the circle indicates the 
position of the polypeptide exit tunnel. 


Figure 11 (Chapter 2). Haloarchaea in liquid cultures and within 
salt crystals. (A) Cultures of Haloferax and Halorubrum: first flask 
(front), H. volcanii WFD11 wild type; second flask, H. volcanii 
WFD11 gas vesicle ΔD mutant (see Fig. 14); third flask, H. volcanii 
WFD11 gas vesicle ΔD mutant complemented with the gvpD gene; 
fourth flask, Halorubrum vacuolatum wild type. (B) Himalayan 
rock salt (“Eubiona”; Claus, GmbH, Baden-Baden, Germany). (C, 
D) Crystals formed from dried Halobacterium cultures (cells 
trapped within). Bars, 1 cm. Crystals courtesy of F. Pfeifer, Darm- 
stadt, Germany. Photographs by F. Pfeifer, Darmstadt, Germany 
(panel A), and A. Kletzin (panels B to D). 


Figure 15 (Chapter 2). Solfatara and Pisciarelli fumaroles. (Left) Fumaroles in the Solfatara caldera (Pozzuoli near Naples, Italy) 
with deposition of sulfur, mercury, and arsenic salts. (Right) Fumarole-heated hole with boiling water, typical of habitats for 
Sulfolobales (Pisciarelli, near Naples, Italy). Photos taken by A. Kletzin. 


Figure 18 (Chapter 2). Three-dimensional structures of tungsten-containing aldehyde:ferredoxin oxidoreductases from Pyro- 
coccus furiosus. (A) Cartoon of the formaldehyde:ferredoxin oxidoreductase (FOR), homotetrameric holoenzyme (150). 
(B) Cartoon of the aldehyde:ferredoxin oxidoreductase (AOR) homodimeric holoenzyme (53). (C) Peptide chains of AOR 
(cyan) superimposed on FOR (magenta) showing close structural similarity (150). (D) Active-site cavity of the FOR with sur- 
rounding residues and glutarate shown (150). (E) [4Fe-4S] cluster and the W-(bis-tungstopterin) cofactor of the AOR (53). FOR 
images reproduced from the Journal of Molecular Biology with permission of the publisher (150); AOR images reproduced 
from Science (53) with permission of the publisher. 


Figure 19 (Chapter 2). Model of the Aeropyrum voltage-gated K+ channel KvAP and comparison with the Streptomyces livi- 
dans KcsA K+ channel. (A) Stereo view of the KvAP pore with electron density map contoured at 1.0σ; carbon (yellow), ni- 
trogen (blue), oxygen (red), potassium (green). (B, C) α-Carbon traces of the KvAP pore (blue) and the Streptomyces lividans 
KcsA K+ channel (green) shown as a side view (B) and end-on from the intracellular side (C); S5, S6, outer and inner helices; 
glycine-gating hinges (red spheres). (D, E) Models of the closed (D) and open (E) KvAP structures based on the positions of 
the paddles (red), the pore and the S5 and S6 helices of KcsA. Reproduced with modifications from Nature (168, 169) with per- 
mission of the publisher. 


Figure 20 (Chapter 2). Electron micrograph and fluorescence images of Ignicoccus and Nanoarchaeum. (A) Transmission 
electron micrograph of thin-sectioned Ignicoccus cell with broad periplasmic space (P) and budded vesicles; OM, outer mem- 
brane, C, cytoplasm, bar, 1 µm. (B) Negative stained Ignicoccus outer membrane, highlighting power spectra of image field 
(C to E) (275). Panels A to E reproduced from Biochemical Society Symposia (275) with permission of the publisher. (F) Ul- 
trathin section of Nanoarchaeum cells attached to the outer membrane of Ignicoccus sp. KIN/4. (G) Platinum shadowing of Ig- 
nicoccus cell with several Nanoarchaeum cells attached (left side of photograph). (H) Confocal laser-scanning micrograph us- 
ing Nanoarchaeum (red) and Ignicoccus-specific probes (green). Panels F to H reproduced from Nature (152) with permission 
of the publisher. 

Figure 23 (Chapter 2). Three-dimensional structures of the T. aci- 
dophilum proteasome and tricorn protease. (A) Side view of the 
26S proteasome/activator particle with the two sets of seven ter- 
minal PA26 subunits and the two α7β7β7α7 rings (PDB code 1YA7) 
(93). (B) Top view of the 20S proteasome core particle showing 
the sevenfold symmetry (PDB code 1PMA) (245). (C) Top view of 
the homohexameric tricorn protease complexed with a tride- 
cameric peptide derivative (PDB code 1N6E) (195). 

Figure 27 (Chapter 2). Three-dimensional structure of the A. ambivalens sulfur oxygenase reductase. (A) The SOR holoenzyme. 
Cartoon representation viewed along the crystallographic fourfold axis; cyan, α-helices; purple, β-sheets; red spheres, Fe ions. 
(B) Molecular accessible surface representation in the same orientation of inner surface of the sphere, color-coded according 
to the calculated electrostatic potentials: red, ≤ −10 ± 1 kT/e; white, neutral; blue, ≥ +10 ± 1 kT/e. (C) Cavity surface repre- 
sentation of the catalytic pocket, with conserved cysteines and iron highlighted; gray arrow, cavity entrance. (D) Effect of mu- 
tants on SOR activity; +, zero activity; | reduced activity; 4 strongly reduced activity. The core active site composed of the Fe 
site and the persulfide-modified Cys31 is highlighted within ellipsoids. Reproduced with minor modifications from Science (411) 
with permission from the publisher. 


Figure 28 (Chapter 2). Canonical respiratory chain in bacteria and mitochondria. Scheme based on 3D structures with the ex- 
ception of the membrane domain of complex I, for which a structure is not available. Domains that have not been identified 
in Archaea are shown in black. PP, periplasm; CM, cytoplasmic membrane; CP, cytoplasm; Q, quinols/quinones. The figure was 
prepared from the coordinates of PDB entries 1FUG (complex I, Thermus thermophilus), 1NEK (complex II, E. coli), 1KYO 
(complex III, Saccharomyces cerevisiae), 1EHK (complex IV, Thermus thermophilus), and 2CCY (cytochrome c, Rhodospiril- 
lum molischianum). 

Figure 2 (Chapter 4). Sequences and structures of representative archaeal chromatin proteins. Primary sequences of HMfB from 
M. fervidus (A), Sul7d (Sac7d) from S. acidocaldarius (B), Alba (Sso10b1) from S. solfataricus (C), and MC1 from 
Methanosarcina sp. CHTISS (D) are shown below the corresponding protein structure. The figure was constructed using struc- 
tures available from the Protein Data Bank (11). Regions with α-helical and β-strand structures are colored identically in the 
sequence and in the corresponding structure. 

Figure 3 (Chapter 6). Subunit structure of RNAPs from the three domains of life. The largest subunit in the Eucarya and β′ in 
the Bacteria is split into two subunits, A1 and A2, in the Archaea. In methanogens, subunit B is also split into two polypeptides, 
B’ and B”. Different parts of bacterial subunit α are encoded by the genes for the archaeal subunits D and L. Subunits E1, F, 
H, N, and P are only shared between the Archaea and Eucarya. The pattern shown is based on separation of subunits by 
polyacrylamide gel electrophoresis under denaturing conditions. The numbers in the subunits of the eucaryal RNAP A (I), B 
(II), and C (III) indicate the molecular mass. 


Figure 4 (Chapter 6). Structural similarity of Pyrococcus RNAP (A) and yeast RNAPII (B). Comparison of interactions of an 
archaeal RNAP inferred from Far-Western analysis with interactions of yeast RNAPII observed in the crystal structure of the 
enzyme. The width of the lines connecting subunits is a measure of the intensity of the interaction. Modified from Science 
(27) with additional data from Proceedings of the National Academy of Sciences USA (2). 

Figure 3 (Chapter 8). Organization of the main ribosomal protein gene clusters in archaeal genomes. SSO, Sulfolobus solfataricus; STO, Sulfolobus tokodaii; 
AFU, Archaeoglobus fulgidus; APE, Aeropyrum pernix; PFU, Pyrococcus furiosus; PHO, Pyrococcus horikoshii; PAB, Pyrococcus abyssi; TKO, Thermo- 
coccus kodakaraensis; PAE, Pyrobaculum aerophilum; MKA, Methanopyrus kandleri; MMA, Methanosarcina mazei; MAC, Methanosarcina acetivorans; 
MTH, Methanothermobacter thermautotrophicus; MJA, Methanococcus jannaschii; MMP, Methanococcus maripaludis; HMA, Haloarcula marismortui; H- 
sp, Halobacterium sp. NRC1; TAC, Thermoplasma acidophilum; TVO, Thermoplasma volcanium. The last line (ECO) shows for comparison the organization 
of the same genes in E. coli that is also present in most bacteria. Genes that are within 50 bp of each other, and may therefore be cotranscribed, are indicated 
in the same color. Domain-specific genes are underlined. 


[Figure labels: M. thermautotrophicus GatDE:tRNAGln complex; GatD (asparaginase subunit), asparaginase active site (Gln + H2O → Glu + NH3); GatE (amidotransferase subunit), tRNAGln amidotransferase active site (Glu-tRNAGln + NH3 → Gln-tRNAGln + H2O).] 

Figure 4 (Chapter 9). Crystal structure of the M. thermautotrophicus GatDE complexed with tRNAGln. The dimer of the het- 
erodimeric GatDE (thus forming a heterotetramer) binds two tRNA molecules. The asparaginase active site of GatD and the 
kinase/amidotransferase active site of GatE are distantly separated, connected with a “molecular tunnel.” 

Figure 3 (Chapter 10). Structure of archaeal group II chaperonin from Thermoplasma acidophilum. (A and B) The side view 
and top view of the crystal structure of T. acidophilum chaperonin, respectively. The α subunits are shown in dark green, and 
the β subunits are shown in dark blue. The hexadecameric structure was drawn using MOLSCRIPT (48). (C) The subunit struc- 
ture of T. acidophilum chaperonin. Apical, intermediate, and equatorial domains are represented by green, blue, and red, re- 
spectively. The helical protrusion is highlighted by yellow. The figure was drawn with the Viewer Light 5.0 software (Accelrys). 


Figure 16 (Chapter 14). Transmission electron micrographs of P. occultum. (a) Pyrodictium cells after freeze etching, exhibit- 
ing an S-layer with p6 symmetry. Bar = 0.5 µm. (b) Extracellular cannulae, a specific feature of the genus Pyrodictium; nega- 
tive staining with uranyl acetate; bar = 0.1 µm. (c) 3D tomogram of a frozen-hydrated Pyrodictium cell with two cannulae; bar 
= 0.2 µm. Modified from the Journal of Structural Biology (106) with permission of the publisher. 


Figure 3 (Chapter 17). Crystal structure of the M. jannaschii Sec61 protein-conducting channel. Views from the top (a) and the front (b). Faces of the helices 
that form the signal-sequence-binding site and the lateral gate through which TMs of nascent membrane proteins exit the channel into lipid are colored. The 
plug, which gates the pore, is green. The hydrophobic core of the signal sequence probably forms a helix, modeled as a magenta cylinder, which intercalates 
between TM2b and TM7 above the plug. Intercalation requires opening the front surface as indicated by the broken arrows, with the hinge for the motion 
being the loop between TM5 and TM6 at the back of the molecule (5/6 hinge). A solid arrow pointing to the magenta circle in the top view indicates schemat- 
ically how a TM of a nascent membrane protein would exit the channel into lipid. Structure and legend reprinted from Nature (107) with permission from 
the publisher. 
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Figure 8 (Chapter 18). Overview of halobacterial signal transduction. Transducer proteins (Htr proteins) are depicted as dimers 
(brown) and shown in their expected topology. The Htr regions involved in adaptation (yellow) and in signal relay (dark 
gray) to the flagellar motor via various Che proteins are indicated. The actions of the Che-protein machinery are illustrated 
for only one of the Htr proteins, shown on the left, for which an interaction with a substrate-loaded, membrane-anchored bind- 
ing protein is indicated. CheD and CheJ (CheC) proteins are omitted for clarity. Htr1 and Htr2 transduce light signals via di- 
rect interaction with their corresponding receptors SRI and SRII. Repellent light signals mediated by SRI and SRII elicit the 
release of switch factor fumarate from a membrane-bound fumarate pool. MpcT senses changes in membrane potential (ΔΨ) 
generated via light-dependent changes in ion transport activity of BR and HR. The relative sizes of receptors, binding pro- 
teins, transducers, and Che proteins approximately reflect their corresponding molecular masses. Reproduced from Molecu- 
lar Microbiology (54) with permission of the publisher and D. Oesterhelt. 


Figure 5 (Chapter 23). Uptake of fluorescent archaeosomes by 
phagocytic cells. Archaeosomes composed of total polar lipids were 
prepared either by incorporating a small amount of the fluorescent 
lipid rhodamine-phosphatidylethanolamine (62) or by entrapping 
1.5 mM carboxyfluorescein (66). Uptake was performed in 1 
ml of RPMI medium added to 0.5 million adhered cells (62). Panels 
show 30-min uptakes: (A) uptake of M. smithii rhodamine-archaeosomes 
(100 µg) by thioglycollate-activated mouse peritoneal macrophages; 
(B) uptake of M. smithii rhodamine-archaeosomes (25 µg) 
by bone marrow-derived DCs; (C) uptake of M. mazei 
carboxyfluorescein-archaeosomes (40 µg) by macrophage culture J774A.1; 
and (D) uptake of H. salinarum rhodamine-archaeosomes (100 µg) 
by thioglycollate-activated mouse peritoneal macrophages. 
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CH3-THMPT synthesis in, 296 
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reactions leading to methane in, 294-295 
regulation of genes in, 302-303 
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sulfur oxidation in, 65-66 
sulfur oxygenase reductase, structure of, 66-67, 
color illustration for Figure 27 (Chapter 2) 
sulfur reduction in, 65 
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Archaeocins, 40 
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methanogenesis for, 54 
S layers of, 322 
Archaeoglobus fulgidus, 244, 245, 270, 360, 424, 455 
regulation of heat shock response in, 451-452 
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safety of, 507 
immune system, 507-508 
as antigen carriers, 499-500 
caldarchaeol content of, 498-499 
costimulatory effects of, 500-501, 502 
fluorescent, uptake by phagocytic cells, 500, color 
illustration for Figure 5 (Chapter 23) 
immune memory of, 505-506 
immune responses and, 503-506 
membrane structures in, 497-498 
polar lipid composition of, effect on humoral adjuvant 
activity, 504-505 
preparation of, 496-497 
presentation pathways of, 501-503 
proton and glycerol permeability of, 498-499 
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Argonaute proteins, archaeal, structure of, 27, 28, 
color illustration for Figure 8 (Chapter 2) 
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ATP-binding cassette transporters, 357, 359-363 
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TatA homolog of, 380 
Bacteria, characterization by rRNA method, 10-11 
identification of, 10-11 
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and light-driven ATP synthesis, 37-38 
BasB, 225 
bc1 complex, 70 
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Biochemistry, comparative, 6-7 
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future goals of, 11 
in conceptual crisis, 9 
Biopolymers, 488 
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Biotechnology, 478-495 
Bradyrhizobium, heat shock genes in, 151 
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binding to, and stabilization by, 233 
Caldarchaeols, in archaeosomes, 498-499 
Car, 225 
Carbon cycle, global, 288, 289 
Carbon dioxide, and methane, dismutation of methanol 
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Carbon dioxide fixation, 274-275 
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oxidoreductase system in, 297, 298 
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whole, biotechnology of, 489-490 
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sHsp and Hsp60, effects of, 214 
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subunits encoded per genome, 214, 215 
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che gene clusters, 398, 399-400 
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flagellation and, 385-410 
Chemotaxis gene families, 398, 401 
Chemotaxis genes, 398-400 
Chemotaxis transducers, 401-402 
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in transcription and DNA metabolism, 116-117 
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structure and evolution of, future investigation of, 
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oxidative, 273, 274 
reductive, 273-275 
regulation of, 276 
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Clamps, sliding, 100, 101 
Cleavage induction factor TFS, 142 
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Repeats clusters, 420 
Codons, nonsense, method for recoding, 30-31 
Coenzymes, in methanogenic pathways, 292, 293 
Cofactors, in methanogenic pathways, 292, 293 
Communication systems, cell-cell, 225 
Compatible solutes, 488-489 
Conjugation, 131-132, 468 
CosB, 225 
Costimulation, 500-501 
Crenarchaeota, 14, 16, 242-243, 420, 427, 429 
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transmission electron micrograph of, 319 
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S layers of, 321 
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(Chapter 2) 
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Dimethylallyl diphosphate, 342 
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152, 153 
organization and abundance of, 93, 95 
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EFP, in first peptide bond, 191 
size of, 191 
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eIF6, 189-190 
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Electron crystallography, for study of S-layer sheets, 316 
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Endosymbiosis hypothesis, eucaryal cells and, 
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functions in, 224, 225 
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Methanosarcina acetivorans, 298-299, 300 
in Methanosarcina barkeri, 297, 299 
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Ferroplasma acidarmanus, 61 
Ferroplasma acidophilum, 61 
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F420H2 : CoMS-SCoB oxidoreductase system, in 
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glycosylation genes in, 393-394 
preflagellin peptidase genes in, 392 
regulation of, 394-395 
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Fumarate, 234-235 
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color illustration for Figure 15 (Chapter 2) 
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GC skew analysis, 414-415, 416 
Gellan gum (Gelrite), 464 
Gene clusters, 412-413 

ribosomal protein, in genomes, 182, 183, 
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Gene reporters, 471-473 
Gene silencing, posttranscription, archaeal, 27 
Gene transfer methods, 465 
Gene transfer systems, 468 
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Genome stability, and evolution, mechanisms of, 120-138 
homologous recombination and, 134 
Genomes, evolution of, gene loss and acquisition in, 422-423 
mechanisms of, 420-423 
rapid, 423 
recombination induced by transposition, 422 
general features of, 412-414 
homologs encoded by, 413 
in organization of translation components, 182-185 
rearrangements of, replication-driven, 420-422 
repeated sequences in, 419-420 
replication origin of, identification of, 415, 417 
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illustration for Figure 3 (Chapter 8) 
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182-185, color illustration for Figure 3 (Chapter 8) 
sizes of, 412 
and repertoires of protein chaperones, 209-210 
structure and evolution of, 411-433 
“whole genome trees” and, 427 
Genomics, amino acid composition and, 425 
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comparative, insights into history from, 426-428 
protein content and, 423-425 
functional, 434-462 
choice of model organisms for, 434-437 
merits and challenges in, 436, 437 
nucleotide composition and codon index in, 425-426 
structural, 452-455 
archaeal targets of, 453-455 
description of, 452-453 
Geranylgeranylglyceryl phosphate synthase, 346, 348 
Giardia lamblia, mRNAs as leaderless in, 186 
Gln-tRNA, mechanism of formation of, 204-206 
D-Gluconate dehydratase, 239 
Gluconeogenesis, 269-270 
Gluconeogenic pathway, reactions of, 269 
Glucose, catabolism of, 260-271 
labeling of pyruvate during, 261 
metabolism of, pathways of, 261, 262 
Glutamate dehydrogenase, 235 
Glutamate transporter, 358, 359 
Glutaminylglycan, in natronococci, 331-332 
Glyceraldehyde 3-phosphate dehydrogenase, 235 
sn-Glycerol-1-phosphate dehydrogenase, 348 
Glycerol phosphate backbone, linking of, 346 
synthesis of, 345, 346 
Glycoproteins, detection of, immunoblot for, 325, 326 
S layers of, 22 
self-assembling S-layer, 489 
Glycosidases, 482-485 
β-Glycosidases, reaction mechanism of, 483 
Glycoside hydrolases, 482-485 
Glycosylation, in Methanococcales, 325-326, 327 
Glycosylation genes, in flagellation, 393-394 
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Haloarchaea, 14, 424-425 
in liquid cultures and salt crystals, 33, color illustration 
for Figure 11 (Chapter 2) 
morphology of, 33-37 
production of specialized organelles in, 149 
single GVP cluster of, in genome, 38, 39 
Haloarcula, glucose metabolism in, 263-264 
Haloarcula marismortui, 180, 244, 479 
ribosome of, structural studies of, 32-33 
structure of 50S subunit of, 29, 30, color illustration 
for Figure 10 (Chapter 2) 
Halobacterial signal transduction, 404, 405, color 
illustration for Figure 8 (Chapter 18) 
Halobacteriales, 427-428 
habitats of, 33 
halophilic, characteristics of, 33, 34-35 
Halobacterium, 377, 419 
binding proteins of, 362 
glucose metabolism in, 263-264 
NRC-1, transcriptional analysis of, 451 
Halobacterium halobium. See Halobacterium salinarum 
Halobacterium NRC1, 437 
Halobacterium NRC-1, heat shock response regulation 
and, 150-151 
Halobacterium salinarum, 123-124, 225, 231, 234, 235, 
358, 472, 479, 328-329, 330, 363, 370, 385, 386, 
397-398, 400, 402-404, 467 
flagellum biosynthesis in, 24 
flagellar filament of, density map of, 386, 388 
glycan moieties of S-layer glycoprotein of, 329, 331 
proteomic analysis of, 441-442 
synthesis of membrane protein BR by, 37 
synthesis of retinal proteins by, 37-38 
Halobacterium vallismortis, 234 
Halocin immunity, mechanisms of, 40 
Halococcus, glucose metabolism in, 263-264 
Haloferax, glucose metabolism in, 263-264 
Haloferax mediterranei, 479 
acidic heteropolysaccharide from, 488 
Haloferax volcanii, 131, 132, 329-330, 360, 363, 376, 
377, 471-472 
gene clusters encoding, 278, 279 
transcriptional analysis of, 451 
Halophiles, extreme, bacterial metabolism in, 32 
Halophiles, neutrophilic, sulfated proteoglycan-like 
S layers in, 328-331 
proteomics of, 441-442 
transcriptional response analysis of, 450-451 
Halophilic archaea, internal salt concentrations of, 32 
Halophilic microorganisms, and habitats, 31-32 
Haloquadratum walsbyi, 33-37 
H2 : CoMS-SCoB oxidoreductase system, in carbon 
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dioxide-reducing species, 297, 298 
in Methanosarcina, 296-297 
Heat shock promoters, interaction of Pyrococcus heat 
shock regulator with, 149, 151 
Heat shock proteins, occurrence in three domains, 210 
small, 212-213 
thermotolerance and, 210 
Heat shock response, regulation of, 149-151 
Helical protrusion, in group II chaperonins, 217-218 
Hemicellulases, 482-483 
Heteropolysaccharide, from Haloferax mediterranei, 488 
His-Asp phosphorelay systems, 241-244 
distribution of, 241 
distribution of ORFs encoding potential in, 242-243 
physiological roles of, 243 
transduction cascades, architecture of, 241, 242 
Histones, 111-114 
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discovery of, 111 
eucaryal, and chromatin, 111 
Homologous recombination, 129-130 
DNA transfer with, 131 
genome stability and, 134 
Horizontal gene transfer, 412, 423, 424, 427, 429-430 
HslU proteases, origin of, 211 
Hsp60, effects of, 214 
HtpX, 211 
HtrII, 225 
HU-related chromatin proteins, 116 
Hydrogen production, 489-490 
Hydrolases, 479 
Hydroxymethylglutaryl-CoA reductase, 21-22 
Hyperosmolarity, adaptation to, 32-33 
Hyperthermophiles, 18 
amino acids in gene coding sequences from, 425 
definition of, 40 
environmental adaptation by, 209 
glucose metabolism in, 265-266 
growth of, temperatures for, 209-223 
Hyperthermophilic archaea, 40-49 
metabolism of sulfur and inorganic sulfur compounds 
by, 41-49 
Hyperthermophilic microorganisms, and habitats, 40-49 
Hyperthermophilic origin of life hypothesis, arguments 
against, 18-19 
Hypusination, 245 
Hypusine, chemical structure of, 29 


IF2-like translation initiation factor, structure of, 189 
Ignicoccus, 315 
fluorescence images of, 57, color illustration for 
Figure 20 (Chapter 2) 
outer membrane of, 333-334 
Ignicoccus sp. KIN4/I, 57 
Initiation factor 2, 238-239 
Inositol phosphates, 233 
Insertion sequences, 419-420 
in hyperthermophile genomes, 124-125 
Inteins, 486-488 
self-catalytic protein-splicing mechanism of, 487 


Introns, group II, 160-161 
in transcripts, 160-162 
in tRNA, rRNA, and mRNA genes, 161 
interrupting continuity of tRNA genes, 199 
Isopentenyl diphosphate, 342 
Isoprenoid building blocks, synthesis of, 342 
Isoprenoid side chains, elongation of, 342-346 
variation in length of, 347-348 


25-kDa protein, 189-190 
7kMk, 115-116 

Korarchaeota, 18, 427, 429 
KtrA potassium transporter, 235 
KvAP, of Aeropyrum pernix, 358 


Lactose transporter, 358 
Large Cluster of Tandem Repeats clusters, 420 
Last Universal Common Ancestor (LUCA) of extant 
cells, 175 
Ligands, allosteric, binding domains for, 236-237 
Light, for activation, 225 
Lipases, 482 
Lipid-anchored secreted proteins, 232 
Lipids, biosynthesis enzymes, distribution of, 341, 342 
biosynthesis of, evolution of, 347-350 
function, and evolution of, 341-353 
biosynthetic pathways of, 342-347 
core, distribution of, 341, 342 
functions of, 341-342 
cyclic, 350 
cyclization of, 346-347 
diversity of, 341 
ester, 350 
ether, stability of, 350 
glycerol diether isoprenoid, structure of, 341, 342 
glycerol ether isoprenoid, biosynthesis of, 342-346 
glycerol tetraether isoprenoid, structure of, 341, 344 
hydrogenation of, 346-347 
in biotechnology, 496 
in membranes, 354 
and adaptations to environmental changes, 355 
isoprenoid, biosynthesis of, stereospecific enzymes in, 
348-350 
specific stereochemistry of, 348-350 
structure of, 341-342 
tetraether, modification of side chains of, 346-347 
types in cell membranes, factors influencing, 350-351 
Liposomes, 496 
proton and glycerol permeability of, 498-499 
Listeria monocytogenes, archaeosome vaccines and, 
506, 507 
Lrp family, paralogs of bacterial proteins with repressor 
and activator activities of, 147-148 
LysM, functions of, 147-148 


Maltodextrin-binding protein, 360 

Maltose-binding protein, 360 

Mannosylglycerate, 489 

Marker frequency analysis, 416, 418 

Maximum likelihood trees, unrooted, based on ribosomal 
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proteins, polymerase subunits, and transcription 
factors, 427-428 
MDR1, metal-dependent repressor, 148 
Mechanosensitive channels, 231, 357-358 
structures of, 358 
Membrane proteins, 365 
integral, of Sec substrates, 369-370 
Membranes, of Archaea, 21-22 
Mesophiles, amino acids in gene coding sequences 
from, 425 
Messenger RNA. See mRNA(s) 
Metabolism, central, 260-287 
pathways of, 260-279 
Metabolites, allosteric regulation by, 233-237 
Metagenomics, 78-79 
Metallosphaera sedula, proteomic analysis of, 443 
Metalloproteins, in Pyrococcus furiosus, 52 
Methane, anaerobic, oxidation by methanotrophic 
archaea, 77-78 
and carbon dioxide, dismutation of methanol and 
monomethylamine to, 299, 300 
discovery of, 71 
importance of, 71 
reactions leading to, 299-300 
in carbon dioxide reduction and acetate fermentation 
pathways, 294-295 
Methanobacteriaceae, 72, 289-290 
Methanobacteriales, 18, 72-73, 289-290, 428 
Methanobacterium, 72, 73 
Methanobacterium thermautotrophicum. See 
Methanothermobacter thermautotrophicus 
Methanobrevibacter, 72, 73, 289 
Methanobrevibacter smithii archaeosomes, 497, 498, 500, 
501, 502-503, 505, 508 
Methanocaldococcaceae, 290 
Methanocaldococcus jannaschii, 199, 201, 202, 435-437 
cysteine in, 201 
heat shock proteins in, 212 
prefoldins and, 212 
proteomic analysis of, 438-439 
splicing endonuclease of, 199 
Methanococcaceae, 290 
Methanococcales, 73-74, 290 
amino acid composition and primary structure of, 
325-327 
cell wall of, 322, 323-328 
gene sequences of, 324-325 
high-level gene expression in, regulatory sequences 
controlling, 324-325 
(hyper)thermophilic and mesophilic, molecular 
characteristics from S-layer proteins of, 325, 326 
S-layer genes from, 324-325 
S layers of, 321-322 
secondary structures of, 327-328 
sequence comparison in, 328 
signal sequences for secretion from, 324, 325 
thermostabilization in, 326-327 
transmission electron micrographs of, 322 
Methanococcoides burtonii, 303, 351, 355 
proteomic analysis of, 439-440 


unsaturated lipids in, 350 
Methanococcus, 471-473 
glucose metabolism in, 266 
Methanococcus jannaschii, 103-104, 231, 235, 355, 362, 
394, 411, 414, 415 
S-layer protein of, hydropathy profile of, 327 
SDS-PAGE of S-layer protein of, 326, 329 
Sec61 protein-conducting channel, structure of, 373, 
374, color illustration for Figure 3 (Chapter 17) 
signal sequences of, 371 
transcriptional response analysis of, 450 
Methanococcus maripaludis, 190, 201-202, 235, 395, 
435-437, 467 
Methanococcus maripaludis genome, 182 
Methanococcus voltae, 302, 386, 467 
protein composition of flagellum, 23-24, 394 
Methanocorpusculaceae, 291 
Methanofuran, 292 
Methanogen chromosomal protein 1, 115 
Methanogenesis, 288-314 
biochemistry of, 72 
pathways of, 292-300 
by dismutation of methyl groups of one-carbon 
compounds, 299-300 
coenzymes and cofactors in, 292, 293 
molecular biology of, 301-304 
physiology of, 71-72 
Methanogens, 14 
characterization of, 2 
“cousins” of, 4 
cultivated, 71 
discovery of, 288 
ecology of, 288-289 
methanotrophic, 71 
morphological and biochemical characteristics of, 288 
natural habitats for, 71 
phylogeny of, 16-18, 289-292 
proteomics of, 438-441 
rRNA catalogs of, 2 
selected species of, characteristics of, 75-76 
transcriptional response analysis of, 450 
translational autoregulation of, 192 
Methanohalophilus mahii, 234 
Methanol, and monomethylamine, dismutation to 
carbon dioxide and methane, 299, 300 
Methanomicrobiaceae, 290 
Methanomicrobiales, 74, 290-291, 427-428 
Methanophenazine, 292 
Methanopyraceae, 292 
Methanopyrales, 292 
Methanopyrus, 73 
two-layered cell walls of, 328 
Methanopyrus kandleri, 115-116, 292, 328, 388, 423, 
424, 427-428 
cysteine in, 201 
Methanopyrus kandleri genome, 182, 184 
Methanosaeta, 292 
proteinaceous sheaths covering cells of, 323 
Methanosaeta concilii, 323 
Methanosaetaceae, 77, 292 
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Methanosarcina, 388, 465, 472-473 
acetate fermentation pathway of, 297-299 
group II chaperonins in, 219 
H2 : CoMS-SCoB oxidoreductase system in, 296-297 
Methanosarcina acetivorans, 209, 212, 219, 302-303, 
435-437 
Fd: CoMS-SCoB oxidoreductase system in, 
298-299, 300 
proteomic analysis of, 440 
Methanosarcina barkeri, 247, 467 
Fd: CoMS-SCoB oxidoreductase system in, 297, 299 
gas vesicle genes in, 38, 39 
methylamine methyltransferase genes of, 202-204 
Methanosarcina mazei, 212, 219, 235, 246, 
302-303, 437 
F420H2 : CoMS-SCoB oxidoreductase system in, 
300, 301 
transcriptional response analysis of, 450-451 
Methanosarcina thermophila, 302-303, 435-437 
proteomic analysis of, 440-441 
Methanosarcinaceae, 77, 291-292 
pyrrolysine in, 202-204 
Methanosarcinales, 72, 74-77, 291 
Methanosphaera, 72, 73, 289 
Methanosphaera stadtmanae, 297, 300 
Methanospirillaceae, 291 
Methanospirillum, proteinaceous sheaths covering 
cells of, 323 
Methanospirillum hungatei, 291, 323, 394 
Methanothermaceae, 72, 289-290 
Methanothermobacter, 72, 73, 465 
Methanothermobacter marburgensis, 301-302 
Methanothermobacter thermautotrophicus, 95-96, 97, 
98, 124, 204-206, 210, 245, 247, 266, 270, 276, 
301-302, 351, 414, 467 
cysteine in, 201 
GatDE, structure of, 204, 205, color illustration for 
Figure 4 (Chapter 9) 
GC skew analysis for, 414-415, 416 
MthK of, 358 
origins of, 95-96 
redox regulation of metabolism in, 245, 246 
S-layer protein distribution in, 324, 325 
structure of prefoldins and, 211, 212 
Methanothermobacter wolfei, 465 
Methanothermus, two-layered cell walls of, 328 
Methanothermus fervidus, 111, 113, 328, 330 
oligosaccharide of S-layer glycoprotein of, 328, 330 
Methanothermus jannaschii, 113-114 
Methylated proteins, 246, 247 
Methylation, 245-247 
Methyltransferase-activating protein, 238 
Mevalonate kinase, 347 
Mevalonate pathway, origins of, 347 
Microbiology, historical development of, 6-7 
Microorganisms, in planet’s biosphere, 9 
Miniature Inverted Repeat Transposable 
Elements, 420 
Minichromosome maintenance (MCM), 97-98 
AAA+ domain of, 98 
DNA unwinding by, 98 
domain organization of, 97 
Mismatch repair, alternative form of, 127-128 
proteins responsible for, 125-127 
Model organisms, archaeal, and taxa, 31-78 
Molecular and cellular features, unique, 120 
Molecular biology, enzymes in, 485 
evolution of, 10-11 
Molecular functions, coordination and modulation of, 
224, 225 
Molecular genetics, 463-477 
Molecular regulatory mechanisms, 224, 228-229 
Molecules, from Archaea, 486-489 
Molybdopterin guanine dinucleotide, 292 
Monomethylamine, and methanol, dismutation to carbon 
dioxide and methane, 299, 300 
Monosaccharides, metabolism to pyruvate, 260-271 
MpcT, 403-404 
MthK, of Methanobacterium thermautotrophicum, 358 
Multidrug transporters, 363 
Multilocus sequence typing, 132, 133 
Mutagenesis, random, 469 
Mutation, avoidance of, 133-134 
rate of, parameters for calculation of, 122 
spontaneous, 121-125 


N-terminal signal peptides, 370-371 

NADH: quinone oxidoreductase, rotenone-insensitive, 69 
rotenone-sensitive, 68-69 

Nanoarchaeota, 242-243 

Nanoarchaeum, fluorescence images of, 57, color 
illustration for Figure 20 (Chapter 2) 

growing at neutral or alkaline pH, 42 
surface layer of, 333-334 

Nanoarchaeum equitans, 18, 57, 198-199, 209, 423, 428 
split tRNA genes in, 200 

Nanoarchaeum 16S rDNA sequence, 57 

Nascent polypeptide-associated complex, 213 

Natrialba magadii, 386 

Natronobacterium pharaonis, 71, 361, 404 

Natronococcus, glutaminylglycan in, 331-332 

Natronomonas pharaonis, 37, 38 

Nitrogen fixation, in methanogens, NrpR in, 148 

Nitrogen metabolism, 235 

Nitrosopumilus, 60 

cold-adapted, characteristics of, 46 

NrpR, regulator of nitrogen fixation in methanogens, 148 

NSF proteins, origin of, 211 

Nuclear antigen, proliferating cell, 101-103 

loading of clamp of, 102-103 

Nucleoids, bacterial chromatin proteins and, 110-111 

Nucleotide excision repair, alternative form of, 128 

Nucleotides, composition of, and codon index, 425-426 
cyclic, 232-233 


Oligonucleotide/oligosaccharide-binding fold, 99 
One-carbon compounds, methyl groups of, dismutation 
of, pathways for methanogenesis by, 299-300 
One-carbon dismutation pathways, regulation of 
genes in, 303 
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Organic precursors, in seafloor hydrothermal mounds, 
origin of cells from, 19, 20 

Osmolarity, 231 
Osmotic stress, microbial strategies to cope with, 32 
2-Oxoacid dehydrogenase, 277 

multienzyme complexes, 277-278 
Oxygen, and other gases, receptors for, 225-231 
Oxygen reductases, 70 


Pentose phosphate pathway, 268-268 
Peptides, N-terminal signal, 370-371 

signal, classes of, 371, 372 
PH0S512, 240 
Phenylalanine, 234 
Phenylalanyl-tRNA synthetase, 238 
Phosphatases, protein, 240 

distribution “pattern” of, 240-241 
Phosphates, inositol, 233 
Phosphoaspartate phosphatase CheC, 244 
Phosphoenolpyruvate carboxylase, 235-236 
Phosphohexosemutases, 238 
Phosphohistidine phosphatase Six, 244 
Phosphomevalonate kinase, 347 
Phosphoproteins, 237-239 
Phosphorylation, protein-serine/threonine/tyrosine, 237-241 
Phosphoserine, 241 
Phosphothreonine, 241 
Phosphotyrosine, 241 
Photoreactivation, putative, DNA repair and, 125 
Phototaxis transducers, 401-402 
Phylogenetic dendrogram, 16S rDNA, 15 
Phylogenetic tree of Archaea, 318-319 
Phylogenetic tree of life, 179 
Phylogeny, and origin of life, 15-19 
Physiological characteristics, and growth efficiency, 
on solidified medium, 463-464 

Picrophilus, 61 

intracellular pH of, 355 

S layers of, 322 
Picrophilus oshimae, 265, 424-425 
Picrophilus torridus, 265, 425 

morphology and ultrastructure of, 62 
Pili, in Archaea, 24 
Poly-B-hydroxybutyrate, 488 
Poly(ADP-ribosyl)ation, 247-248 
Polymers, 488 
Posttranslational modification, sensing, signal transduction 

and, 224-259 

Potassium transporter, KtrA, 235 
PPM family, 240 
PPP family, 240 
Preflagellin peptidase genes, 392 
Prefoldins, 211-212 

cooperative action with chaperonins, 212 

structure of, 211, 212 
Procaryote, course of microbiology and, 8 

deconstructing of, 6-7 

dismissive attitude toward, 8 

intent of, 8 

new, 7 


Prokaryote, belief in, 2-3 
Proliferating cell nuclear antigen, 101-103 
loading of clamp of, 102-103 
Proline-betaine-binding protein, 360 
Protease, 479 
Protein chaperones, repertoires of, genome size and, 
209-210 
Protein complexes, RNA-degrading, 167-168 
Protein-folding mechanism, of group II chaperonins, 
215-217 
Protein-folding systems, 209-223 
Protein phosphatases, distribution “pattern” of, 
240-241 
PPM family of, 240 
PPP family of, 240 
Protein-serine/threonine/tyrosine phosphorylation, 
237-241 
Protein-serine/threonine/tyrosine kinases, 237-238, 239 
Protein synthesis, elongation in, 190-191 
steps in, 185 
Protein targeting, in Sec translocation pathway, 
371-375 
Protein taxonomy, 8 
Protein-tyrosine phosphatases, 240 
Proteinaceous surface layers, 315-340 
Proteins, adaptations of, 209 
ATP-binding, 362-363 
chromatin. See Chromatin proteins 
evolution of regulation of, 249-250 
in ribosomes, 177-179 
informational, 412 
integral membrane, of Sec substrates, 369-370 
membrane, 365 
methylated, 246, 247 
methyltransferase-activating, 238 
nascent-associated, 213 
posttranslational modifications of, 244-248 
S-layer, sequence homology of, 328, 329 
substrate-binding, 360-362 
anchoring of, 360-362 
translocation into and across cytoplasmic membranes, 
369-384 
transport, 357 
Proteolysis, regulated, 248 
Proteomics, 437-443 
of halophiles, 441-442 
of methanogens, 438-441 
of nonmethanogenic, nonhalophilic Archaea, 
442-443 
Proteomics studies, 280 
Proton motive force, 354, 355 
Ptr2, functions of, 147 
Puromycin, 467 
PylRS, pyrrolysyl-tRNAPyl formation by, 203 
Pyrobaculum aerophilum, 58, 60, 177, 181, 182, 
198-199, 212 
Pyrococcus, galactose metabolism in, 267-268 
glucose metabolism in, 265 
Pyrococcus abyssi, 211, 265, 422 
a/eIF2 function in, 188 
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genome organization of, 418, 419, 420-421 
transcription skew lines in, 421 
Pyrococcus furiosus, 51-52, 210-211, 213, 245, 247, 
265, 270-271, 273, 276, 360, 422, 435, 467-468, 
473, 479 
dissimilatory APS reductase, crystal structure of, 54 
DNA microarrays, fluorescence intensities of, 447, 448 
metabolism of, influence of carbon source on, 
447-449 
influence of sulfur on, 444-447 
metalloproteins, sugar, sulfur and hydrogen metabolism 
in, 52 
proteomic analysis of, 443 
response to cold shock, 449-450 
structures of tungsten-containing aldehyde: ferredoxin 
oxidoreductases, 52, 53, color illustration for 
Figure 18 (Chapter 2) 
transcriptional response to thermal stress, 447 
Pyrococcus genome sequences, 51-52 
Pyrococcus heat shock regulator, interaction with heat 
shock promoters, 149, 151 
Pyrococcus horikoshii, 232, 238-239, 240, 241, 
414, 422 
GC skew analysis for, 414-415, 416 
genome organization of, 418, 419, 420-421 
glutamate transporter of, 358, 359 
transcription skew lines in, 421 
Pyrococcus RNA polymerase, and yeast RNA polymerase 
II, structural similarity of, 144-145, color 
illustration for Figure 4 (Chapter 6) 
Pyrodictiaceae, 57-58 
growth of, 58 
Pyrodictium, extracellular cannulae of, 333 
Pyrodictium occultum, transmission electron micrographs 
of, 333, color illustration for Figure 16 (Chapter 14) 
Pyrrolysine, 190, 303-304 
as lysine derivative, 190 
encoding of, 190 
in Methanosarcinaceae, 202-204 
in polypeptide chain, 190-191 
Pyrrolysyl-tRNAPyl, formation by PylRS, 203 
Pyruvate, labeling of, during glucose catabolism, 261 
metabolic fate of, 271-276 
metabolism of glucose to, 260-267 
metabolism of monosaccharides to, 260-271 
oxidative decarboxylation of, 271 
Pyruvate ferredoxin oxidoreductase, 271, 272 


Receptors, 224-232 
cryptic, 232 
functional classes of, 225 
Regulation, allosteric, by metabolites, 233-237 
Replication origins, of chromosomes, 414-417 
Replication proteins, in Crenarchaeota and 
Euryarchaeota, 16 
Replicative polymerases, 101 
Replichores, chromosomes, 417-418 
Respiratory chain, canonical, in bacteria and 
mitochondria, 68, color illustration for Figure 28 
(Chapter 2) 

Response regulator domains, dephosphorylation of, 244 

aRF1, 191 

Rhodopsins, sensory, 225 

Ribonucleotide reductase, 236 

Ribosomal RNA, maturation of, 162-164 

Ribosome/mRNA interaction, in translation initiation, 
185-187 

mechanism of, 186 

Ribosomes, architecture of, 176, 179-180 

biosynthesis of, 180-181 

chemical composition of, 176-177 

historical review of, 176-177 

proteins in, 177-179 

70S ribosome, 3D structure of, 29 

Rio1 and Rio2, 239 
RNA, ribosomal, maturation of, 162-164 
size and number of genes for, 177 

16S, secondary structure of, 177, 179 

small noncoding, aL7Ae binds to, 165 

transfer. See Transfer RNA 

rRNA, characterization of bacteria by, 10-11 
tRNA, aminoacyl. See Aminoacyl-tRNA 

biosynthesis of, 198-200 

direct aminoacylation of, 200 

generation of functional molecules of, 198 

individual promoters of, 198 

RNA-degrading complex, exosome-like, 167-169 
mRNA-degrading enzymes, bacterial type, 167 
RNA-degrading protein complexes, 167-168 
rRNA genes, 413 

size and number of, in genomes, 182, 183, 

color illustration for Figure 3 (Chapter 8) 

rRNA nucleosides, modification of, 164-165 

tRNA genes, 413 

split, in Nanoarchaeum equitans, 200 

tRNA ligase, 162 

tRNA nucleosides, modification of, 164-165 

tRNA operons, structure of, and processing of ribosomal 
RNA precursor, 162-164 

RNA polymerase(s), 143-145 

amino acid sequences of, phylogenetic dendrogram of, 
16-18 

from three domains, subunit structure of, 143-144, 
color illustration for Figure 3 (Chapter 6) 

II in yeast, and Pyrococcus RNA polymerase, 
structural similarity of, 144-145, color 
illustration for Figure 4 (Chapter 6) 

mechanism of transcription by, 145-147 

of Pyrococcus, and yeast RNA polymerase II, structural 
similarity of, 144-145 

reconstitution of, 144 

structure of, 139, 140 

RNA processing, 158-174 

RNA-RNA interactions, 426 

RNA sequencing, 1-2 

mRNA translation, leaderless, mechanism for, 192 
tRNAHis guanylyltransferase, 199 

mRNA(s), and ribosome interaction, in translation 
initiation, 185-187 
mechanism of, 186 
as leaderless, 186-187 
polyadenylation of, 169 
principles of decay of, 165, 166 
processing of, 165-169 
recognition and initiation of translation, 29-30 
stability of, 166-167 
structure of, 181 
RNase P, 198-199 
maturation of tRNA 5’ end by, 158-159 
RNase P RNPs, box C/D methylation guide, 164 
box H/ACA pseudouridylation guide, 165 
tRNase Z, processing at tRNA 3’ end by, 159-160 
ROMA approach, 152, 153 
RTCB, 238 


S-layer proteins, sequence homology of, 328, 329 
S layers, glycosylated, 316 
of Crenarchaeota, 317-321, 322 
of Desulfurococcales, 321 
of Euryarchaeota, 321, 322-323 
of Sulfolobales, 319, 320 
structural features of, 316-317 
5S RNA genes, 182 
Saccharomyces cerevisiae, 210 
Sec pore, components of, 375 
signal recognition particle-dependent targeting to, 
372-374 
signal recognition particle-independent targeting to, 375 
Sec substrates, extracellular protein structures of, 
371-372 
integral membrane proteins of, 369-370 
Sec signal sequences of, 370 
secreted, and N-terminal signals, 370-371 
Sec translocation pathway, 369-377 
energetics of, 376-377 
protein targeting in, 371-375 
Sec pore-associated components of, 375, 376 
Sec pore of, 375 
Second messengers, 232-233 
SELB (specialized elongation factor), 190 
Selenium, 190 
Selenocysteine, 190 
Selenocysteine insertion structure, 190 
Selenoproteins, 190 
Sensing, signal transduction, and posttranslational 
modification, 224-259 
Sensor-response machinery, 224-225, 230-231 
terms and abbreviations applicable to, 224, 226-227 
Sensor-response mechanisms, undiscovered, 250 
Sensor-response pathways, basic elements of, 224, 
228-229 
Sensory rhodopsins, 225 
Short Regularly Spaced Repeats clusters, 420 
sHsp, effects of, 214 
Shuttle vectors, 466, 468-469 
Signal peptide classes, 371, 372 
Signal recognition particle, composition of, 372 
importance of, 373 


interaction with ribosome, 373 
Signal recognition particle-dependent targeting, 
to Sec pore, 372-374 
Signal recognition particle-independent targeting, 
to Sec pore, 375 
Signal transduction, halobacterial, 404, 405, color 
illustration for Figure 8 (Chapter 18) 
Single-stranded proteins, DNA-binding, 98-99, 142-143 
composition of, 99 
SixA phosphatases, 244 
Sliding clamps, 100, 101 
Sodium-dependent glucose transporter, 358 
Sodium motive force, 355 
Solfatara and Pisciarelli fumaroles, 41, color illustration 
for Figure 15 (Chapter 2) 
Solute transport, 354-368 
Solute transport systems, classification of, 355-356 
distribution of, 357-363 
number of transporters per Mb of genome, 357 
Solutes, compatible, 488-489 
Splicing endonuclease, 161-162 
Spontaneous mutation, 121-125 
SsoPK2 and SsoPK3, 239 
SSU rRNA genes, 182 
Staphylothermus marinus, 321 
S-layer protein tetrabrachion, 23 
Starch, enzymatic hydrolysis of, 482 
Stereospecificity, of lipids, 348-350 
Stop codons, in termination of translation, 191 
Streptomyces alboniger, 467 
Stygiolobus, 64 
Substrate-binding proteins, 360-362 
anchoring of, 361 
Succinate : quinone oxidoreductases, 69-70 
Sugar transport regulator, TrmB, 151-152 
Sul10a, 115 
Sul10b/Alba, 114-115 
Sul7d, 114 
Sulfataras, 60-62 
Sulfide, oxidation in A. ambivalens, 67 
Sulfite, oxidation in A. ambivalens, 67 
Sulfolobales, 61 
cell morphology of genera of, 63, 64 
phylogenetic dendrogram based on 16S rDNA 
sequences, 64, 65 
S layers of, 317-319 
thermoacidophilic, characteristics of, 48 
Sulfolobus, 63-65, 133, 354 
ATP-binding cassette transporters in, compared, 362, 364 
crenarchaeal, cell cycle of, 103-104, 105 
galactose metabolism in, 267 
glucose metabolism in, 264-265 
intracellular pH of, 355 
mevalonate pathway and, 347 
origins of, 95-96 
proteomic analysis of, 442-443 
putative S-layer glycoproteins and membrane 
glycoproteins in, 319, 320 
Sulfolobus acidocaldarius, 61, 63, 114, 122-123, 130, 
131, 132, 363, 435, 467-468 
GGPP synthase, 347-348 
oxygen reductase complexes of, 70-71 
S layers of, 317-319 
Sulfolobus shibatae, 210 
Sulfolobus solfataricus, 61, 99, 101, 114-115, 116-117, 
181, 213, 236, 238, 239, 240, 241, 244, 246-248, 
264-265, 271, 278, 281, 355, 360-361, 361, 362, 
363, 373, 419-420, 422, 435, 473, 479 
a/eIF2 function in, 188 
crystal structure of, 100 
insertion sequences in, 124-125 
signal peptides of, 371 
signal sequences of, 371 
time-dependent response to pH shift, 444, 446 
time-dependent response to temperature shift, 444, 445 
Sulfolobus tokodaii, 422, 435 
Sulfur, and inorganic sulfur compounds, metabolism by 
hyperthermophilic archaea, 41-49 
Sulfur cycle, bioinorganic, 66 
Sulfur-dependent thermophiles, 14 
Sulfur oxygenase reductase, 66 
crystal structure of, 66-67, color illustration for 
Figure 27 (Chapter 2) 
cysteine residues in, 66 
Sulfurisphaera, 64 
Sulfurococcus, 64 
Systems Biology, goal of, 281 


Tat (twin-arginine translocation) machinery, 379-381 
Tat (twin-arginine translocation) pathway, 377-381 
deduced from complete genome sequences, 379 
Tat machinery, 379-381 
Tat signals of, 370, 378 
Tat substrates of, 378-379 
Tat (twin-arginine translocation) signals, 370, 378 
Tat (twin-arginine translocation) substrates, 378-379 
TATA-binding protein, 139, 140, 141-142 
TATFIND, 378, 379 
Taxis, mechanism of, 404-406 
Temperature, environmental, response to changes in, 355 
Termination factors, class 1, 191 
Tetrahydromethanopterin, 292 
TF55 chaperonin-like protein, 49 
TFB, domain structure of, 141-142 
promoter-bound, 139 
structural organization of, 141 
Thermococcales, 18, 49-52, 428 
defined genera of, 50 
hyperthermophilic, growing at neutral or alkaline pH, 42 
rapid growth of, 50 
S layers of, 319, 322 
transmission electron micrograph of, 319 
Thermococcus, glucose metabolism in, 265-266 
Thermococcus celer, 50 
Thermococcus kodakaraensis, 50, 231, 238, 422, 424, 
Thermococcus litoralis, 360, 363 
Thermococcus zilligii, 268 
Thermofilum, structure of, 58, 59 
glucose metabolism in, 265 
proteomic analysis of, 443 
pH, 45 
S layers of, 321 
Thermoproteus, glucose metabolism in, 267 
Thermotoga maritima, 266, 360, 424 
Thermotolerance, chaperones and, 210-211 
TRAMP complex, polyadenylation of transcripts by, 169 
Transaminases, 485 
Transcription, 26-27 
chromatin proteins in, 116-117 
initiation of, 139, 140 
major transitions in, 145-147, 148 
mechanism of, 145-147 
and regulation of, 139-157 
regulation of, 147-152 
transcription factors in, 141-142 
transcription signals in, 140-141 
Transcription factor IIB, 139, 141-142 
binding by, 144-145 
Transcription factor IIE, 139, 142 
Transcription factor IIH, 139, 142 
Transcription trees, based on ribosomal proteins, 
polymerase subunits, and transcription factors, 
427-428 
443-444 




Transcriptomics, 443-452 
phototaxis, 401-402 
responding to membrane potential, 403-404 
Transduction, 131, 468 
sensing, and posttranslational modification, 
224-259 
Figure 8 (Chapter 18) 
addition of 3'-terminal CCA sequence, 160 
3’ end, processing at, by tRNase Z, 159-160 
nucleotidyltransferases, classes of, 160 
Transfer RNA ends, maturation of, 158-160 
Transformation, 468 
Transformation systems, gas vesicles and, 38-40 
cellular function and, 175-176 
components of, genome organization of, 182-185 
elongation in, 190-191 
initiation of, 185-190 
leaderless mRNA, mechanism of, 192 
mechanism of, 185-192 
regulation of, 192, 193 
176-182 
Translation initiation factor(s), 184-185 
evolution of, 187-188 
function of, 193 
IF2-like, structure of, 189 
polymerase subunits, and transcription factors, 
Transmembrane receptors, 248 
Transmission electron micrograph(s), 319 
secondary, 358-359 
Trehalose, 488-489 
Tricorn peptidase, 63 
Tricorn protease, 63 
Triticum aestivum, heat shock proteins in, 212 
TrmB, sugar transport regulator, 151-152 
TrpY, regulation of tryptophan operon by, 148-149 
synthesis of, 148-149 
Two-component systems, 241-244 

distribution of, 241 

distribution of ORFs encoding potential in, 242-243 

physiological roles of, 243 

transduction cascades, architecture of, 241, 242 
Uracil phosphoribosyltransferase, 236 
Uracil-sensitive DNA synthesis, 128-129 
fundamental characteristics of, 3 
subunits of, 370 